AI Improves Air Pollution Prediction Accuracy

A recent review published in the journal Environmental Research examined how well machine learning (ML) algorithms predict ambient air pollution levels compared to traditional statistical methods.

Study: AI Improves Air Pollution Prediction Accuracy. Image Credit: TR STOK/Shutterstock.com

The researchers focused on three key pollutants: nitrogen dioxide (NO₂), ultrafine particles (UFPs), and black carbon (BC). These pollutants have high spatial and temporal variability and significant health impacts. The review aimed to determine if ML methods offer better performance than traditional statistical techniques in capturing these variations.

Background

Air pollution is a major global health issue that contributes to illness and death worldwide. Accurate exposure assessments are crucial for understanding the risks of air pollution. Pollutants such as NO₂, UFPs, and BC fluctuate widely in space and time, making them challenging to model with traditional methods.

Land use regression (LUR) models are often used to estimate outdoor air pollution, but these models may struggle with the complex, non-linear relationships between pollution levels and environmental factors. ML techniques, which can better capture these non-linear relationships, have become increasingly popular in air pollution modeling.
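To make the point about non-linear relationships concrete, here is a minimal sketch in plain Python. It is not taken from the review: the data are synthetic (a made-up pollutant level that drops sharply beyond a hypothetical distance from a road), and the single-split "stump" stands in only as the simplest building block of tree-based methods, not as a random forest. Both fits are scored in-sample, so the numbers illustrate flexibility rather than predictive skill.

```python
def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns a prediction function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_stump(xs, ys):
    """One-split regression tree: pick the threshold that minimizes squared error."""
    best = None
    for t in sorted(set(xs))[1:]:  # skipping the minimum keeps both sides non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = sum((y - mean_l) ** 2 for y in left) + sum((y - mean_r) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, mean_l, mean_r)
    _, t, mean_l, mean_r = best
    return lambda x: mean_l if x < t else mean_r


def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Synthetic, hypothetical data: concentration is 40 near the road, 10 beyond it.
xs = [float(d) for d in range(20)]          # distance from road (arbitrary units)
ys = [40.0 if x < 5 else 10.0 for x in xs]  # pollutant level (arbitrary units)

line = fit_line(xs, ys)
stump = fit_stump(xs, ys)
r2_line = r_squared(ys, [line(x) for x in xs])
r2_stump = r_squared(ys, [stump(x) for x in xs])
print(f"linear R2 = {r2_line:.2f}, stump R2 = {r2_stump:.2f}")
```

On this deliberately step-shaped data the straight line explains only part of the variance, while a single split recovers the pattern exactly. Real pollution data are noisier, and a flexible model can also overfit, which is why held-out evaluation matters in practice.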

About the Review

In this study, the authors aimed to assess the performance of ML methods in predicting ambient concentrations of NO₂, UFPs, and BC compared to statistical regression models. To identify relevant studies, they searched two major scientific databases, Scopus and Web of Science, for research published up to June 13, 2024.

The studies had to meet specific criteria: they needed to report spatial or spatiotemporal models using both ML and statistical regression methods for the same pollutants and datasets, focus on outdoor UFPs, NO₂, or BC, include a quantitative assessment of model performance, and be peer-reviewed articles with original data.

The researchers identified 38 eligible studies with 46 model comparisons. These studies were conducted in various countries and covered urban, regional, and global spatial extents. The authors extracted detailed information on study designs, modeling methods, and performance metrics, including the coefficient of determination (R²) and root mean square error (RMSE).
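For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them in plain Python. The concentration values are made up for illustration, and the R² definition used here (1 minus the ratio of residual to total sum of squares) is an assumption: individual studies may instead report the squared correlation between observed and predicted values.

```python
import math


def rmse(observed, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical NO2 concentrations (µg/m³), not data from the review.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
print(f"RMSE = {model_rmse:.2f} µg/m³, R2 = {model_r2:.3f}")
# prints "RMSE = 2.55 µg/m³, R2 = 0.948"
```

A higher R² means the model explains more of the variation in measured concentrations, while a lower RMSE means smaller typical errors in the pollutant's own units.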

Statistical methods ranged from linear regression techniques, like multiple linear regression (MLR) and stepwise linear regression (SLR), to nonlinear and regularized methods such as generalized additive models (GAM) and least absolute shrinkage and selection operator (LASSO). The ML methods included random forest (RF), artificial neural networks (ANN), extreme gradient boosting (XGBoost), and convolutional neural networks (CNN), among others.

Key Results

The review found that ML methods outperformed statistical regression models in 34 of the 46 model comparisons. On average, the best ML models showed an increase of 0.12 in R² and a 20% decrease in RMSE compared to the best statistical models. Tree-based methods, such as RF and XGBoost, were the most frequently used and best-performing ML approaches, surpassing other methods in 12 of 17 multi-model comparisons. In contrast, ANN models often performed the worst among the evaluated ML methods.
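The headline figures can be put in proportion with a few lines of arithmetic. The counts below come from the review as reported above; the baseline R² of 0.60 and RMSE of 10.0 are hypothetical values chosen only to show what the average gains would look like for one model.

```python
# Counts reported in the review.
ml_wins, comparisons = 34, 46
tree_wins, multi_model = 12, 17

share_ml = ml_wins / comparisons        # share of comparisons won by ML
share_tree = tree_wins / multi_model    # share of multi-model comparisons won by tree-based methods
print(f"ML outperformed regression in {share_ml:.1%} of comparisons")
print(f"Tree-based methods led in {share_tree:.1%} of multi-model comparisons")

# Hypothetical baseline for a statistical model (not from the review).
baseline_r2 = 0.60
baseline_rmse = 10.0

# Average gains reported in the review: +0.12 in R2 and a 20% lower RMSE.
ml_r2 = baseline_r2 + 0.12
ml_rmse = baseline_rmse * (1 - 0.20)
print(f"R2: {baseline_r2:.2f} -> {ml_r2:.2f}, RMSE: {baseline_rmse:.1f} -> {ml_rmse:.1f}")
```

In other words, ML came out ahead in roughly three of every four comparisons, and the average gains are meaningful but not dramatic: under the assumed baseline, R² would move from 0.60 to 0.72 and RMSE from 10.0 to 8.0.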

ML methods provided greater performance gains for spatiotemporal models (predicting hourly, daily, or monthly pollutant levels) than for spatial models (predicting annual or seasonal averages). This may be because linear, non-regularized statistical methods struggled with the complexity of short-term pollutant variations.

Interestingly, nonlinear and regularized statistical regression methods, such as GAM and LASSO, sometimes performed similarly to ML models, especially for spatial models. This suggests that flexible regression techniques can match ML performance in some scenarios.

Applications

This review has significant implications for air pollution exposure assessment and epidemiological research. Accurate modeling of ambient air pollutant concentrations is crucial for estimating individual exposures and understanding the health impacts of air pollution. The superior performance of ML, particularly tree-based methods, in predicting spatiotemporal variations of NO₂, UFPs, and BC suggests that these techniques could improve exposure assessments and epidemiological studies.

The insights from this study can guide the choice of modeling approaches for different air pollutants and study designs. For example, nonlinear and regularized statistical methods may be more suitable for modeling spatial patterns, while ML techniques could be beneficial for spatiotemporal modeling.

Conclusion

The review concluded that ML methods, especially tree-based algorithms, generally outperformed traditional statistical regression techniques in predicting the spatial and temporal variations of key air pollutants. It highlighted the potential of ML to enhance air pollution exposure assessment and contribute to more accurate epidemiological studies.

The review also emphasized the need for further research comparing a broader range of statistical and ML methods, including deep learning, across different contexts. It called for standardized reporting guidelines for methodologies and results to ensure transparency, reproducibility, and comparability across studies.

Written by

Muhammad Osama

Muhammad Osama is a full-time data analytics consultant and freelance technical writer based in Delhi, India. He specializes in transforming complex technical concepts into accessible content. He has a Bachelor of Technology in Mechanical Engineering with specialization in AI & Robotics from Galgotias University, India, and he has extensive experience in technical content writing, data science and analytics, and artificial intelligence.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Osama, Muhammad. (2024, August 16). AI Improves Air Pollution Prediction Accuracy. AZoAi. Retrieved on September 18, 2024 from https://www.azoai.com/news/20240816/AI-Improves-Air-Pollution-Prediction-Accuracy.aspx.

  • MLA

    Osama, Muhammad. "AI Improves Air Pollution Prediction Accuracy". AZoAi. 18 September 2024. <https://www.azoai.com/news/20240816/AI-Improves-Air-Pollution-Prediction-Accuracy.aspx>.

  • Chicago

    Osama, Muhammad. "AI Improves Air Pollution Prediction Accuracy". AZoAi. https://www.azoai.com/news/20240816/AI-Improves-Air-Pollution-Prediction-Accuracy.aspx. (accessed September 18, 2024).

  • Harvard

    Osama, Muhammad. 2024. AI Improves Air Pollution Prediction Accuracy. AZoAi, viewed 18 September 2024, https://www.azoai.com/news/20240816/AI-Improves-Air-Pollution-Prediction-Accuracy.aspx.
