Relative performance of different data mining techniques for nitrate concentration and load estimation in different type of watersheds

Li, Shiyang; Bhattarai, Rabin; Cooke, Richard A.; Verma, Siddhartha; Huang, Xiangfeng; Markus, Momcilo; Christianson, Laura

2020

The increasing availability of water quality datasets has led to a greater focus on hydrologic and water quality analysis, which in turn requires more efficient and accurate modelling methods. Data mining techniques have been increasingly used in place of more traditional simulation methods for water quality analysis and for predicting the concentration and load of nitrogen pollutants. In this study, we tested the multilayer perceptron (MLP), k-nearest neighbor (k-NN), random forest (RF), and reduced error pruning tree (REPTree) methods, along with traditional linear regression, to predict nitrate levels based on long-term data from six watersheds with different land-use practices in the midwestern United States. Both the concentration and load results indicated that REPTree had the best performance, with an R² of 0.61–0.85 and a relative absolute error of <75.8%. Watershed type, however, influenced the performance of the data mining methods: all four methods were more accurate for the urban-dominated watershed and less accurate for the agricultural and forested watersheds. Among these four methods, the classification tree methods (REPTree and RF) performed better than the cluster methods (MLP and k-NN) for agricultural and forested watersheds. Our results indicate that both the data structure associated with the dominant land use and the type of algorithm should be carefully considered when selecting a data mining method to predict nitrate concentration and load for a watershed.
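The two scores quoted in the abstract can be illustrated with a minimal sketch. It assumes that R² is the coefficient of determination and that relative absolute error is the total absolute error divided by the total absolute error of a predictor that always returns the observed mean; the record does not state the study's exact definitions, and the function names and numbers below are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def relative_absolute_error(observed, predicted):
    """Total absolute error as a percentage of the error of a mean-only predictor."""
    mean_obs = sum(observed) / len(observed)
    abs_err = sum(abs(o - p) for o, p in zip(observed, predicted))
    baseline = sum(abs(o - mean_obs) for o in observed)
    return 100.0 * abs_err / baseline


# Hypothetical nitrate concentrations (mg/L), for illustration only.
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [3.0, 4.0, 6.0, 7.0]

print(f"R^2 = {r_squared(observed, predicted):.2f}")
print(f"RAE = {relative_absolute_error(observed, predicted):.1f}%")
```

Under these assumed definitions, a relative absolute error below 100% means the model beats a predictor that always returns the mean, so the reported bound of <75.8% should be read against that baseline.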

AGROVOC keywords

algorithms; data collection; data mining; land use; neural networks; nitrates; nitrogen; pollutants; prediction; pruning; regression analysis; trees; water pollution; water quality

Bibliographic information

Published in

Environmental pollution

Volume 263; Page 114618; ISSN 0269-7491

Publisher

Elsevier Ltd

Other subjects

Nitrate concentration; Watershed land use; Midwestern United States; Forested watersheds; Water quality analysis

Language

English

License

http://purl.org/eprint/accessRights/OpenAccess

Type

Journal Article; Text

In AGRIS since: 2024-02-29

Format: MODS

Data provider

This record is provided by the National Agricultural Library

Links

DOI: https://dx.doi.org/10.1016/j.envpol.2020.114618

AGRIS - International System for Agricultural Science and Technology
