In a paper recently published in the journal Desalination and Water Treatment, researchers evaluated machine learning models for predicting chemical oxygen demand (COD), biological oxygen demand (BOD), and suspended solids (SS) at the AlHayer wastewater treatment plant (ALWTP) in Saudi Arabia.
Four models were tested: logistic regression (LR), random forest (RF), gradient boosting (GB), and support vector regression (SVR). RF excelled in predicting COD and SS, while GB was best for BOD, highlighting the efficacy of ensemble learning models.
Background
Water pollution remains a critical global concern. Urbanization and industrialization exacerbate it by increasing wastewater generation, so effective wastewater treatment is crucial for mitigating environmental impact and preserving water resources. Traditional treatment methods combine physical, chemical, and biological processes to remove contaminants such as organic matter, suspended solids, and nutrients.
Previous research in modeling wastewater treatment plant (WWTP) performance has primarily focused on predicting specific pollutant levels using limited historical data and conventional modeling approaches. However, these efforts often overlook the complex nonlinear dynamics, high variability in operational conditions, and the need for comprehensive long-term data integration.
This study addressed these gaps by proposing advanced machine-learning techniques to enhance the prediction accuracy of WWTP performance at ALWTP in Saudi Arabia. The research spanned data collected over four years (2018-2021), encompassing diverse operational scenarios. Techniques including LR, RF, GB, and SVR were evaluated and compared to develop robust predictive models.
By integrating sophisticated machine learning methodologies and leveraging extensive longitudinal data, this study aimed to provide more accurate forecasts of contaminant removal efficiency at ALWTP. This approach not only advanced predictive modeling capabilities but also contributed to sustainable water management practices by optimizing treatment plant operations and resource allocation.
Methodology and Data Analysis
In this study, the performance of ALWTP in Riyadh, Saudi Arabia, was evaluated using machine learning techniques to predict key water quality parameters. ALWTP processed approximately 400,000 cubic meters (m³) of wastewater daily through primary, secondary, and tertiary treatment stages, culminating in sludge treatment for reuse in irrigation.
Data spanning four years (2018-2021), obtained from Saudi Water National Company, encompassed various influent and effluent parameters such as COD, BOD, total organic carbon (TOC), SS, total dissolved solids (TDS), and potential of hydrogen (pH), capturing seasonal variations.
Four machine learning models were employed to predict pollutant levels. LR predicted probabilities based on input variables, while RF aggregated predictions from multiple decision trees to enhance accuracy. GB iteratively improved predictions by correcting previous errors using decision trees, and SVR utilized kernel functions like radial basis function for nonlinear regression.
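To make the boosting idea concrete, here is a minimal sketch of gradient boosting for squared error, in which each new tree is fitted to the residuals left by the trees before it. It is an illustration only, not the authors' implementation: the one-split "stump" learner, the function names, the learning rate, and the synthetic data are all assumptions introduced here.

```python
from statistics import mean

# Synthetic, made-up data for illustration only (not from the study).
XS = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
YS = [2.0, 2.5, 3.1, 5.0, 5.2, 7.9, 8.1, 8.0]


def fit_stump(xs, residuals):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    best = None
    for threshold in sorted(set(xs))[:-1]:  # drop the max so both sides are non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_boosted_stumps(xs, ys, n_rounds=20, learning_rate=0.5):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = mean(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for x, p in zip(xs, preds)]
    return base, stumps, learning_rate


def predict_boosted(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def training_mse(model, xs, ys):
    return mean((y - predict_boosted(model, x)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    for rounds in (0, 1, 20):
        model = fit_boosted_stumps(XS, YS, n_rounds=rounds)
        print(rounds, round(training_mse(model, XS, YS), 4))
```

Running the script prints the training error after 0, 1, and 20 rounds, and the error shrinks as rounds are added. A random forest differs in that its trees are trained independently on resampled data and their predictions are averaged, rather than being fitted in sequence to residuals.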
Each model's parameters were optimized using Python, focusing on accuracy metrics such as coefficient of determination (R²), mean absolute error (MAE), and median absolute error (MdAE). The researchers employed a rigorous methodology: data evaluation, partitioning into training and testing sets, model development, and performance assessment against statistical benchmarks.
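The three metrics named above follow standard definitions, sketched below in plain Python. The function names and the sample values are illustrative assumptions, not figures from the paper.

```python
from statistics import mean, median


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Average of the absolute prediction errors."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def median_absolute_error(y_true, y_pred):
    """Median of the absolute prediction errors; less sensitive to outliers."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical observed vs. predicted removal efficiencies (%).
    observed = [90.0, 92.0, 88.0, 94.0]
    predicted = [91.0, 91.0, 89.0, 92.0]
    print("R2  :", r2_score(observed, predicted))
    print("MAE :", mean_absolute_error(observed, predicted))
    print("MdAE:", median_absolute_error(observed, predicted))
```

MAE and MdAE are both reported because a few large misses raise the mean noticeably while leaving the median almost unchanged, so comparing the two hints at how heavy-tailed the errors are.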
By pairing these models with the long-term operational record from ALWTP, the researchers aimed to improve predictions of treatment efficiency and to give plant operators a firmer basis for optimizing treatment processes and resource allocation.
Results and Discussions
The researchers presented a statistical overview of influent and effluent wastewater parameters at ALWTP, Riyadh. Parameters such as BOD, COD, SS, and others exhibited non-Gaussian distributions with varying skewness and kurtosis, indicating significant variability in wastewater characteristics. The treatment plant achieved high removal efficiencies: 91% for COD, 92% for BOD, and 90% for SS on average, ensuring compliance with Saudi Arabian regulations.
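Removal efficiency is conventionally computed as the percentage drop in concentration between influent and effluent. The sketch below assumes that conventional definition; the article does not spell out the paper's exact formula. The function names and concentrations are made-up for illustration.

```python
from statistics import mean


def removal_efficiency(influent, effluent):
    """Percentage of a pollutant removed between plant inlet and outlet."""
    if influent <= 0:
        raise ValueError("influent concentration must be positive")
    return 100.0 * (influent - effluent) / influent


def mean_removal_efficiency(pairs):
    """Average removal efficiency over (influent, effluent) samples."""
    return mean(removal_efficiency(i, e) for i, e in pairs)


if __name__ == "__main__":
    # Hypothetical COD samples in mg/L as (influent, effluent); not study data.
    cod_samples = [(500.0, 45.0), (480.0, 48.0), (520.0, 41.6)]
    for influent, effluent in cod_samples:
        print(influent, effluent, removal_efficiency(influent, effluent))
    print("mean:", mean_removal_efficiency(cod_samples))
```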
Four models were evaluated for their ability to predict SS, COD, and BOD removal efficiencies. RF performed consistently well, with R² ranging from 0.82 to 0.95 across the training and testing stages. In testing, RF achieved an R² of 0.946 for SS removal, 0.906 for COD removal, and 0.922 for BOD removal. SVR performed notably worse across all parameters. Sensitivity analysis identified TDS and COD as significant influences on removal efficiencies.
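The article does not say which sensitivity analysis method the authors used. One common, model-agnostic option is permutation importance: shuffle one input column and measure how much the prediction error grows. The sketch below illustrates that idea only. The toy model, the feature layout (TDS, pH), and all values are hypothetical.

```python
import random
from statistics import mean


def toy_model(row):
    """Hypothetical fitted model: depends on TDS, ignores pH."""
    tds, ph = row
    return 95.0 - 0.002 * tds


# Made-up rows of (TDS in mg/L, pH); targets come from the toy model itself.
ROWS = [(900.0, 7.1), (1100.0, 7.4), (1300.0, 6.9),
        (1500.0, 7.2), (1700.0, 7.6), (1900.0, 7.0)]
TARGETS = [toy_model(r) for r in ROWS]


def mae(model, rows, targets):
    return mean(abs(t - model(r)) for r, t in zip(rows, targets))


def permutation_importance(model, rows, targets, column, n_repeats=10, seed=0):
    """Mean increase in MAE after shuffling one feature column."""
    rng = random.Random(seed)
    baseline = mae(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        shuffled = [r[column] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:column] + (v,) + r[column + 1:]
                    for r, v in zip(rows, shuffled)]
        increases.append(mae(model, permuted, targets) - baseline)
    return mean(increases)


if __name__ == "__main__":
    for name, column in (("TDS", 0), ("pH", 1)):
        print(name, permutation_importance(toy_model, ROWS, TARGETS, column))
```

A feature the model relies on shows a clear error increase when shuffled, while one it ignores shows none. In this toy setup that means a positive score for TDS and zero for pH.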
Comparison with previous soft computing models indicated that RF and GB models in this study performed competitively, demonstrating robustness in predicting effluent quality parameters. These models offered reliable tools for optimizing wastewater treatment processes, enhancing operational efficiency, and ensuring environmental compliance.
Conclusion
In conclusion, the researchers evaluated machine learning models for predicting the removal efficiency of COD, BOD, and SS at the ALWTP in Saudi Arabia. Four models were tested using four years of operational data. RF excelled in predicting COD and SS removal, while GB was superior for BOD. Sensitivity analysis highlighted the significant impact of TDS and COD on removal efficiencies. These findings demonstrated the potential of machine learning models to enhance predictive accuracy, optimize wastewater treatment processes, and support sustainable water management practices.
Journal reference:
- Hani Mahanna, Nora ELRahsidy, Mosbeh R. Kaloop, Shaker El-Sapakh, Ayed Alluqmani, Raouf Hassan, Prediction of Wastewater Treatment Plant Performance through Machine Learning Techniques, Desalination and Water Treatment, 2024, 100524, ISSN 1944-3986, https://doi.org/10.1016/j.dwt.2024.100524, https://www.sciencedirect.com/science/article/pii/S1944398624005587