Results 21-30 of 1,233
Integration of machine learning-based prediction for enhanced Model’s generalization: Application in photocatalytic polishing of palm oil mill effluent (POME)
2020
Ng, Kim Hoong | Gan, Y.S. | Cheng, Chin Kui | Liu, Kun-Hong | Liong, Sze-Teng
In predicting palm oil mill effluent (POME) degradation efficiency, a previously developed quadratic model quantitatively evaluated the effects of O₂ flowrate, TiO₂ loading and initial POME concentration in a lab-scale photocatalytic system, but it suffered from poor generalization due to overfitting. Its high RMSE (131.61) and low R² (−630.49) indicate that it cannot describe POME degradation at unseen factor ranges, confirming the poor generalization. To overcome this issue, several models were developed via machine learning-assisted techniques, namely Gaussian Process Regression (GPR), Linear Regression (LR), Decision Tree (DT), Support Vector Machine (SVM) and Regression Tree Ensemble (RTE), and were then assessed systematically. To achieve high generalization, all models were subjected to a ‘train-all-test-all’ strategy and to 5-fold and 10-fold cross validation. Specifically, the GPR model achieved high accuracy under the ‘train-all-test-all’ strategy, judging from its low RMSE (1.0394) and high R² (0.9962), but it was exposed to the risk of overfitting. In contrast, despite the relatively poorer RMSE and R² (1.7964 and 0.9886) obtained in 5-fold cross validation, the GPR model showed the highest generalization while sufficiently preserving its accuracy during development. The SVM and RTE models also demonstrated promising R² (0.9372 and 0.9208), which were however overshadowed by their high RMSEs (4.2174 and 4.7366). Furthermore, the strong generalization of the GPR model was also verified in 10-fold cross validation. The lowest RMSE (2.1624) and highest R² (0.9835), obtained with a feature number of 36, confirmed its sufficiency in both generalization and accuracy. The other models all showed slightly lower R² (> 0.9), plausibly due to their higher RMSE (> 4.0).
According to the GPR model, optimized POME degradation (52.52%) can be obtained at 70 mL/min of O₂, 70.0 g/L of TiO₂ and 250 ppm of POME concentration, with only ∼3% error as compared to the actual data.
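The abstract above reports RMSE and R² under k-fold cross validation, including a strongly negative R² for the earlier quadratic model. The following is a minimal sketch of how such scores can be computed, not the authors' procedure: it assumes a simple least-squares line as a stand-in for GPR, uses synthetic data, and all function names are hypothetical. It also shows that R², defined as 1 − SS_res/SS_tot, becomes negative when predictions are worse than always predicting the mean.

```python
import math
import random


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (can be negative)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (stand-in model, not GPR)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def k_fold_scores(xs, ys, k, seed=0):
    """Pool the held-out predictions from all k folds, then score them once."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    actual, predicted = [], []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            actual.append(ys[i])
            predicted.append(slope * xs[i] + intercept)
    return rmse(actual, predicted), r_squared(actual, predicted)


# Synthetic data (hypothetical): a noisy straight line.
rng = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 5.0 + rng.gauss(0.0, 1.0) for x in xs]

cv_rmse, cv_r2 = k_fold_scores(xs, ys, k=5)
print(f"5-fold RMSE = {cv_rmse:.3f}, R^2 = {cv_r2:.4f}")

# Predictions far from the truth give a negative R^2.
bad_r2 = r_squared([1.0, 2.0, 3.0], [30.0, 20.0, 10.0])
print(f"R^2 for poor predictions = {bad_r2:.1f}")
```

Scoring only held-out points is what separates the cross-validated figures from the ‘train-all-test-all’ figures, where the model is evaluated on the same data it was fitted to.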
Cryptosporidium and Giardia in surface water and drinking water: Animal sources and towards the use of a machine-learning approach as a tool for predicting contamination
2020
Ligda, Panagiota | Claerebout, Edwin | Kostopoulou, Despoina | Zdragas, Antonios | Casaert, Stijn | Robertson, Lucy J. | Sotiraki, Smaragda
Cryptosporidium and Giardia are important parasites due to their zoonotic potential and impact on human health, often causing waterborne outbreaks of disease. Detection of (oo)cysts in water matrices is challenging and few countries have legislated water monitoring for their presence. The aim of this study was to investigate the presence and origin of these parasites in different water sources in Northern Greece and identify interactions between biotic/abiotic factors in order to develop risk-assessment models. During a 2-year period, using a longitudinal, repeated sampling approach, 12 locations in 4 rivers, irrigation canals, and a water production company, were monitored for Cryptosporidium and Giardia, using standard methods. Furthermore, 254 faecal samples from animals were collected from 15 cattle and 12 sheep farms located near the water sampling points and screened for both parasites, in order to estimate their potential contribution to water contamination. River water samples were frequently contaminated with Cryptosporidium (47.1%) and Giardia (66.2%), with higher contamination rates during winter and spring. During a 5-month period, (oo)cysts were detected in drinking-water (<1/litre). Animals on all farms were infected by both parasites, with 16.7% of calves and 17.2% of lambs excreting Cryptosporidium oocysts and 41.3% of calves and 43.1% of lambs excreting Giardia cysts. The most prevalent species identified in both water and animal samples were C. parvum and G. duodenalis assemblage AII. The presence of G. duodenalis assemblage AII in drinking water and C. parvum IIaA15G2R1 in surface water highlights the potential risk of waterborne infection. No correlation was found between (oo)cyst counts and faecal-indicator bacteria. Machine-learning models that can predict contamination intensity with Cryptosporidium (75% accuracy) and Giardia (69% accuracy), combining biological, physicochemical and meteorological factors, were developed. 
Although these prediction accuracies may be insufficient for public health purposes, they could be useful for augmenting and informing risk-based sampling plans.
Predicting the modifying effect of soils on arsenic phytotoxicity and phytoaccumulation using soil properties or soil extraction methods
2020
Zhang, Xiaoqing | Dayton, Elizabeth A. | Basta, Nicholas T.
Soils have the ability to modify contaminant bioavailability and toxicity. Predicting the modifying effect of soil on arsenic phytoaccumulation and phytoavailability using either soil property data or soil chemical extraction data is highly desirable in risk assessment of contaminated soil. In this study, plant bioassays important to ecological receptors were conducted with 20 soils with a wide range of chemical and physical soil properties to determine the relationships between As measured by soil chemical extraction (soil pore water, Bray-1, sodium phosphate solution, hydroxylamine hydrochloride, and acid ammonium oxalate) or soil physico/chemical properties and arsenic phytotoxicity and phytoaccumulation. Soil pore water As and Bray-1 extracted As were significantly (P < 0.01) correlated with lettuce tissue As, and those extractants and sodium phosphate were correlated with ryegrass tissue As. Hydroxylamine and acid ammonium oxalate extractions did not correlate with plant bioassay endpoints. Simple regression results showed that lettuce tissue relative dry matter growth (RDMG) was inversely related to tissue As concentration (r² = 0.85, P < 0.01), with no significant relationship for ryegrass. Soil clay exhibited strong adsorption of As and significantly reduced tissue As for lettuce and ryegrass. In addition to clay content, reactive aluminum oxide (AlOx), reactive Fe oxide (FeOx) and eCEC were inversely related to ryegrass tissue As. A multiple regression equation was strongly predictive (r² = 0.83) for ryegrass tissue As (log transformed) using soil AlOx, organic matter, pH, and eCEC as variables. Soil properties can greatly reduce contaminant phytoavailability, plant exposure and risk, which should be considered when assessing contaminant exposure and site-specific risk in As-contaminated soils.
Nitrate exposure induces intestinal microbiota dysbiosis and metabolism disorder in Bufo gargarizans tadpoles
2020
Xie, Lei | Zhang, Yuhui | Gao, Jinshu | Li, Xinyi | Wang, Hongyuan
Excess nitrate has been reported to be associated with many adverse effects in humans and experimental animals. However, there is a paucity of information on the effects of nitrate on the intestinal microbial community. In this study, the effects of nitrate on the development, intestinal microbial community, and metabolites of Bufo gargarizans tadpoles were investigated. B. gargarizans were exposed to control, 5, 20 and 100 mg/L nitrate-nitrogen (NO₃–N) from eggs to Gosner stage 38. Our data showed that the body size of tadpoles significantly decreased in the 20 and 100 mg/L NO₃–N treatment groups when compared to control tadpoles. Exposure to 20 and 100 mg/L NO₃–N also caused indistinct cell boundaries and nuclear pyknosis of mucosal epithelial cells in the intestine of tadpoles. In addition, exposure to NO₃–N significantly altered the intestinal microbiota diversity and structure. The facultative anaerobic Proteobacteria occupied the niche of the obligately anaerobic Bacteroidetes and Fusobacteria under the pressure of NO₃–N exposure. According to the results of functional prediction, NO₃–N exposure affected the fatty acid metabolism pathway and amino acid metabolism pathway. The whole-body fatty acid components were found to be changed after exposure to 100 mg/L NO₃–N. Therefore, we concluded that exposure to 20 and 100 mg/L NO₃–N could induce deficient nutrient absorption in the intestine, resulting in malnutrition of B. gargarizans tadpoles. High levels of NO₃–N could also change the intestinal microbial communities, causing dysregulation of fatty acid metabolism and amino acid metabolism in B. gargarizans tadpoles.
A meta-analysis of microbial community structures and associated metabolic potential of municipal wastewater treatment plants in global scope
2020
Tian, Lu | Wang, Lin
Microbial communities in wastewater treatment plants (WWTPs) are affected by various environmental factors. The microbial communities from different WWTPs around the world were compared by meta-analysis of published high-throughput 16S rRNA sequencing data from these WWTPs, with various environmental factors considered. Community richness indexes showed significant differences between altitude groups, and there was no latitudinal diversity gradient in WWTPs’ microbiomes. Climate was the most influential factor and process was the second, while latitude and altitude contributed 5.51% and 4.78% of the overall variance of the data, respectively. Ternary plots showed three significantly enriched bacterial communities each for latitude and altitude. The Mantel test illustrated that the microbial community was strongly correlated with dissolved oxygen, temperature and pollutant concentrations. The prediction of potential functions revealed that microbial function structures were more stable than community structures. Some dominant bacteria in WWTPs have potential pathogenicity and may pose a serious threat to the environment and human health.
A three-phase-successive partition-limited model to predict plant accumulation of organic contaminants from soils treated with surfactants
2020
The application of surfactants is an effective way to inhibit the migration of organic contaminants (OCs) from soil to plants, and thus would be a great candidate method for producing safe agricultural products in organic-contaminated farmland. In this study, it was found that cetyltrimethyl ammonium bromide (CTMAB) reduced the OCs in cabbage by 22.0–64.1%, and those in lettuce by 18.8–36.5%. We developed a mathematical model to predict the accumulation of OCs in plants in the presence of surfactants. The successive partitioning of OCs among three phases, namely, soil, soil water and plant roots, was considered. The equilibrium of OC between the soil and soil water was scaled using the sorption coefficient of OCs on soils normalized by the soil organic carbon (Kₒc) and carbon-normalized OCs sorption coefficient with the sorbed surfactants (Kₛₛ). To precisely calculate the Kₒc and Kₛₛ, the bioavailable and bound OCs were measured using a sequential extraction method. Linear positive correlations between the logarithm of Kₒc (or Kₛₛ) and the logarithm of the octanol-water partition coefficient (log Kₒw) of OCs were established for laterite soils, paddy soils and black soils. In the presence of CTMAB, the equilibrium of OCs between the soil water and plant roots was scaled using the carbon-normalized OC sorption coefficient with the sorbed surfactants (Kₛf), whose logarithmic value was linearly correlated with the log Kₒw of the OCs. A three-phase-successive partition-limited model was developed based on these relationships, demonstrating an average prediction accuracy of 76.6 ± 36.8%. Our results indicated that the decrease in bioavailable OCs in soils and the increase in sorption of OCs on roots should be taken into consideration when predicting plant uptake. This research provides a validated mathematical model for predicting the concentration of OCs in plants in the presence of surfactants.
Gastrointestinal dysbiosis following diethylhexyl phthalate exposure in zebrafish (Danio rerio): Altered microbial diversity, functionality, and network connectivity
2020
Buerger, Amanda N. | Dillon, David T. | Schmidt, Jordan | Yang, Tao | Zubcevic, Jasenka | Martyniuk, Christopher J. | Bisesi, Joseph H.
Microbiome community structure is intimately involved in key biological functions in the gastrointestinal (GI) system including nutrient absorption and lipid metabolism. Recent evidence suggests that disruption of the GI microbiome is a contributing factor to metabolic disorders and obesity. Poor diet and chemical exposure have been independently shown to cause disruption of the GI microbiome community structure and function. We hypothesized that the addition of a chemical exposure to overfeeding exacerbates adverse effects on the GI microbiome community structure and function. To test this hypothesis, adult zebrafish were fed a normal feeding regime (Control), an overfeeding regime (OF), or an overfeeding regime contaminated with diethylhexyl phthalate (OF + DEHP), a suspected obesogen-inducing chemical. After 60 days, fecal matter was collected for sequencing, identification, and quantification of the GI microbiome using the 16S rRNA hypervariable region. Analysis of beta diversity indicated distinct microbial profiles between treatments, with the largest divergence between Control and OF + DEHP groups. Based upon functional predictions, OF + DEHP treatment altered carbohydrate metabolism, while both OF and OF + DEHP affected biosynthesis of fatty acids and lipid metabolism. Co-occurrence network analysis revealed decreases in cluster size, a fracturing of the microbial community network into unconnected components, and a loss of keystone species in the OF + DEHP treatment when compared to Control and OF treatments. Data suggest that the addition of DEHP in the diet may exacerbate microbial dysbiosis, a consequence that may explain in part its role as an obesogenic chemical.
Relative performance of different data mining techniques for nitrate concentration and load estimation in different type of watersheds
2020
Li, Shiyang | Bhattarai, Rabin | Cooke, Richard A. | Verma, Siddhartha | Huang, Xiangfeng | Markus, Momcilo | Christianson, Laura
The increasing availability of water quality datasets has led to a greater focus on hydrologic and water quality analysis, thus requiring more efficient and accurate modelling methods. Data mining techniques have been increasingly used for water quality analysis and prediction of the concentration and load of nitrogen pollutants instead of more traditional simulation methods. In this study, we tested the multilayer perceptron (MLP), k-nearest neighbor (k-NN), random forest, and reduced error pruning tree (REPTree) methods, along with the traditional linear regression, to predict nitrate levels based on long-term data from six watersheds with different land-use practices in the midwestern United States. Both the concentration and load results indicated that REPTree had the best performance, with an R² of 0.61–0.85 and a relative absolute error of <75.8%. The different watershed types, however, influenced the performance of the data mining methods, where all four methods showed a higher accuracy for urban dominant watershed and lower accuracy for agricultural and forest watersheds. Out of these four methods, classification tree methods (REPTree and RF) performed better than cluster methods (MLP and k-NN) for agricultural and forested watersheds. Our results indicated that both the data structure based on the dominant land use and type of algorithmic method should be carefully considered for selecting a data mining method to predict nitrate concentration and load for a watershed.
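The abstract above reports a relative absolute error without defining it. As a hedged illustration only, the sketch below uses one common definition: the total absolute error of the predictions divided by the total absolute error of always predicting the mean of the observations, expressed as a percentage. Whether the authors used exactly this form is an assumption, and the function name and nitrate values are hypothetical.

```python
def relative_absolute_error(actual, predicted):
    """Total absolute error relative to a mean-only predictor, in percent."""
    mean_a = sum(actual) / len(actual)
    abs_err = sum(abs(a - p) for a, p in zip(actual, predicted))
    baseline = sum(abs(a - mean_a) for a in actual)
    return 100.0 * abs_err / baseline


# Hypothetical nitrate concentrations (mg/L), not data from the study.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 5.0, 9.0]

rae = relative_absolute_error(observed, predicted)
print(f"Relative absolute error = {rae:.1f}%")
```

Under this definition a value of 100% means the model does no better than predicting the mean, so a value below 100% indicates some gain over that baseline.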
A hybrid air pollutant concentration prediction model combining secondary decomposition and sequence reconstruction
2020
Sun, Wei | Huang, Chenchen
Acid rain is a serious threat to terrestrial ecosystems. To provide more accurate early warning information for acid rain prevention, urban planning, and travel planning, a novel air pollutant prediction model was proposed in this paper to predict NO₂ and SO₂. First, the data were decomposed into several sub-sequences by a complete ensemble empirical mode decomposition with adaptive noise. Second, the subsequences are reconstructed by variational mode decomposition and sample entropy. Then, the new subsequences are predicted by the extreme learning machine combined with the whale optimization algorithm. The empirical analysis was carried out through 8 data sets. According to the experimental results, three main conclusions can be drawn. First, the proposed model in this paper has excellent prediction performance and robustness. In all the comparison experiments, the R² and RMSE of the proposed model are the best among all the models. Second, data preprocessing is very necessary. After adding the decomposition algorithm, the average improvement levels of R² and RMSE were 897.57% and 50.78%, respectively. Third, the re-decomposition of IMF1 is an effective method to improve prediction accuracy. After the re-decomposition of IMF1, R² can be improved by 13.64% on average on the original basis, and RMSE can be reduced by 31.99% on average. The results of this study can provide a valuable reference for the research of air pollutant prediction. In future work, the application of the proposed model in other air pollutants or other regions can be explored.
A new spatially explicit model of population risk level grid identification for children and adults to urban soil PAHs
2020
Li, Fufu | Wu, Shaohua | Wang, Yuanmin | Yan, Daohao | Qiu, Lefeng | Xu, Zhenci
The traditional incremental lifetime cancer risk (ILCR) model of urban soil polycyclic aromatic hydrocarbon (PAH) health risk assessment has a large spatial scale and commonly calculates relevant statistics by regarding the whole area as a geographic unit but fails to consider the high heterogeneity of the PAH distribution and differences in population susceptibility and density in an area. Therefore, the risk assessment spatial performance is insufficient and does not reflect the characteristics of cities, which are centered on human activities and serve the needs of humans, thus making it difficult to effectively support PAH prevention and treatment measures in cities. Here, the random forest model combined with the kriging residual model (RFerr-K) is used to estimate high-precision PAH distributions, separately considering the exposure characteristics of children and adults with different susceptibilities, and kindergarten point-of-interest (POI) and population density index (PDI) data were used to estimate the distributions of the kindergarten children and adults in the study area. Through the refined expression of these three dimensions, a new spatially explicit model of the incremental lifetime cancer-causing population distribution (MapPILCR) was constructed, and the risk threshold range delineation method was proposed to accurately identify regional risk levels. The results showed that the RFerr-K model significantly improves the accuracy of PAH prediction. The susceptibility index (SI) of children is 45% higher than that of adults, and POI and PDI data can be used effectively in population distribution estimation. The MapPILCR model provides a useful method for the spatially explicit assessment of the cancer risk of urban populations to inspire urban pollution grid management.