Site-scale modeling of surface ozone in Northern Bavaria using machine learning algorithms, regional dynamic models, and a hybrid model
2021
Nabavi, Seyed Omid | Nölscher, Anke C. | Samimi, Cyrus | Thomas, Christoph | Haimberger, Leopold | Lüers, Johannes | Held, Andreas
Ozone (O₃) is a harmful pollutant when present in the lowermost layer of the atmosphere. Therefore, the European Commission formulated directives to regulate O₃ concentrations in near-surface air. However, almost 50% of the 5068 air quality stations in Europe do not monitor O₃ concentrations. This study aims to provide a hybrid modeling system that fills these gaps in the hourly surface O₃ observations on a site scale with much higher accuracy than existing O₃ models. This hybrid model was developed using estimations from multiple linear regression-based eXtreme Gradient Boosting Machines (MLR-XGBM) and O₃ reanalysis from European regional air quality models (CAMS-EU). The binary classification of extremely high O₃ events and the 1- and 24-h forecasts of hourly O₃ were investigated as secondary aims. In this study, thirteen stations in Northern Bavaria, six of which do not monitor O₃, were chosen as test sites. Considering the computational complexity of machine learning algorithms (MLAs), we also applied two recent MLA interpretation methods, namely SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME). With SHAP, we showed an increasing effect of temperature on O₃ concentrations, which intensifies for temperatures exceeding 17 °C. According to LIME, O₃ concentration peaks are mainly governed by meteorological factors under dry and warm conditions on a regional scale, whereas local nitrogen oxide concentrations control base O₃ concentrations during cold and wet periods. While recently developed MLAs for the spatial estimation of hourly O₃ concentrations had a station-based root-mean-square error (RMSE) above 27 μg/m³, our proposed model significantly reduced the estimation errors by about 66%, to an RMSE of 9.49 μg/m³. We also found that logistic regression (LR) performed best in the site-scale classification and MLR-XGBM in the 24-h forecast of O₃ concentrations, with a station-averaged accuracy of 0.95 and a station-averaged RMSE of 19.34 μg/m³, respectively.
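As a rough illustration of the error metric quoted in the abstract, the following minimal Python sketch computes an RMSE and the relative reduction between two RMSE values. The function names and the sample hourly values are hypothetical and not taken from the study; the baseline of 27 μg/m³ is only the lower bound stated in the abstract, so the computed reduction (about 65%) is slightly below the reported figure of about 66%.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted):
        raise ValueError("sequences must have the same length")
    if len(observed) == 0:
        raise ValueError("sequences must not be empty")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_reduction(baseline, improved):
    """Fractional reduction of an error metric relative to a baseline."""
    return (baseline - improved) / baseline


# Hypothetical hourly O3 concentrations in ug/m3 (illustration only).
observed = [40.0, 55.0, 70.0, 62.0]
predicted = [43.0, 51.0, 70.0, 62.0]
example_rmse = rmse(observed, predicted)

# RMSE values quoted in the abstract; 27 is a lower bound on the baseline.
reduction = relative_reduction(27.0, 9.49)

print(f"example RMSE: {example_rmse:.2f} ug/m3")
print(f"relative reduction vs. 27 ug/m3: {reduction:.1%}")
```

A baseline of roughly 27.9 μg/m³ would give the reported reduction of about 66%, which is consistent with the abstract's wording of "above 27 μg/m³".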
Bibliographic information
This bibliographic record has been provided by National Agricultural Library