In a paper published in the journal Water, researchers examined wastewater as a substitute for limited water resources and its impact on technological advancement in the sector. They focused on the abundance of data this approach generates, including physical, chemical, biological, and microbiological information, and on the machine learning (ML) algorithms that use such experimental data to gain insights.
The study aimed to identify the most popular ML models from the "Web of Science" database and analyzed their relevance and historical development in wastewater treatment technologies. The research highlighted the dominance of developed countries in publishing articles on this topic, with a significant upward trend in publications. Supervised learning, particularly models like Artificial Neural Network (ANN), Random Forest (RF), Support Vector Machine (SVM), Linear Regression (LR), Adaptive Neuro-Fuzzy Inference System (ANFIS), Decision Tree (DT), and Gradient Boosting (GB), was prevalent. Genetic algorithms (GA) stood out as a standard method for model calibration. While ML enhanced data analysis in wastewater treatment, challenges persisted in obtaining high-quality data and interpreting complex models.
ML Applications in Wastewater Treatment
ML is a versatile technology within the broader artificial intelligence (AI) field. It simplifies complex problems, making them more manageable in various sectors. In geoscience, it aids in lithological discrimination and hydrology predictions. Finance uses AI for risk management, while healthcare benefits from ML algorithms in medicine and epidemiological analysis. Environmental disciplines, including wastewater treatment, leverage ML to simulate complex systems.
Wastewater treatment, essential for environmental protection and water resource conservation, is influenced by diverse factors, resulting in a complicated process with varying parameters. ML models offer a promising approach to understanding and optimizing these systems. Applications include energy-efficient treatment, quality prediction, wastewater reuse, and performance enhancement. Data preprocessing is also crucial for accurate results. This study employs textual analysis to explore ML's role in wastewater treatment, providing insights into popular models and algorithms.
Methodology
The study drew its data from the Web of Science platform, using specific keywords such as "Wastewater," "ML," and "AI" to narrow the search to pertinent scientific literature. The resulting semi-structured data then underwent further analysis.
The researchers also used text mining, or text analytics, which encompasses techniques and tools for extracting insights and information from unstructured or semi-structured textual documents. Text mining serves as the bridge that transforms textual content into actionable knowledge.
Wastewater Treatment Research Insights
The analysis of wastewater treatment research trends reveals a substantial increase in publications since 2016, driven by advancements in numerical and statistical approaches and computing technology. Global contributions span 81 countries, with developed nations leading in publication numbers. International collaboration is robust, with solid partnerships involving countries such as the United States of America (USA), China, and Spain. The top 30 publishing journals are concentrated in developed countries, reflecting their prominence in this field. Optimization methods, with Genetic Algorithm being the most prominent, and ML models like ANNs and SVMs are widely utilized in addressing wastewater treatment challenges, indicating the growing significance of AI and ML in this domain.
Review Findings
This study provides a comprehensive overview of the application of ML models in wastewater treatment, offering insights into publication trends, influential models, and global contributions.
ML is increasingly essential in scientific research, particularly in modeling complex systems using extensive data. In wastewater treatment, the choice of ML model depends on research objectives and data readiness. The study set out to identify the most commonly used ML models in the wastewater domain through textual analysis. After preprocessing the abstract text, the researchers compiled a list of AI models.
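To make the textual-analysis step concrete, the following is a minimal sketch of how model mentions might be counted across abstracts. It is not the authors' pipeline: the keyword patterns, the function names, and the sample abstracts are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical keyword map (an assumption, not taken from the paper):
# model label -> regex patterns that count as a mention of that model.
MODEL_PATTERNS = {
    "ANN": [r"artificial neural networks?", r"\bann\b"],
    "RF": [r"random forests?"],
    "SVM": [r"support vector machines?", r"\bsvm\b"],
    "DT": [r"decision trees?"],
    "GB": [r"gradient boosting"],
}


def preprocess(text):
    """Lowercase the text and replace punctuation with single spaces."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s\-]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def count_model_mentions(abstracts):
    """Count how many abstracts mention each model at least once."""
    counts = Counter()
    for abstract in abstracts:
        clean = preprocess(abstract)
        for label, patterns in MODEL_PATTERNS.items():
            if any(re.search(pattern, clean) for pattern in patterns):
                counts[label] += 1
    return counts


# Made-up abstracts, for illustration only.
SAMPLE_ABSTRACTS = [
    "An Artificial Neural Network (ANN) predicts effluent quality.",
    "Random forest and support vector machine models estimate inflow.",
    "We compare a random forest with gradient boosting for energy use.",
]

mention_counts = count_model_mentions(SAMPLE_ABSTRACTS)
print(mention_counts.most_common())
```

Counting each abstract at most once per model keeps a single paper that repeats an acronym many times from inflating the ranking. A real analysis would need a much fuller keyword list and some handling of ambiguous acronyms.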
The results reveal that seven models are prominent in wastewater research: ANN, RF, SVM, LR, ANFIS, DT, and GB. Since 2018, use of these models has increased, with some seeing a 75% rise in applications. ANN is inspired by biological neurons and uses layers of artificial neurons to process information. It finds applications in monitoring and designing wastewater treatment systems, optimizing energy consumption in sludge incineration, and predicting plant performance.
RF is an ensemble learning approach based on decision trees. Researchers employ it in wastewater-based epidemiology, in analyzing energy consumption in treatment plants, in modeling wastewater treatment systems, and in predicting inflow at treatment facilities.
SVM, a supervised classification algorithm, effectively handles non-linear problems and finds applications in evaluating biological wastewater treatment, calculating quality indicators, detecting industrial discharges, and simulating phosphorus removal.
LR is a simple statistical modeling method that finds the best-fit line relating input variables to an output. It is applied to understand adsorption processes, predict wastewater treatment plant performance, and study pharmaceutical adsorption onto biochar.
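The best-fit line can be illustrated with ordinary least squares for a single input variable. The sketch below is a generic textbook formulation, not code from the reviewed studies; the function name fit_line and the dose and removal figures are invented for the example.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: adsorbent dose (g/L) against contaminant removal (%).
dose = [1.0, 2.0, 3.0, 4.0]
removal = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(dose, removal)
print(f"removal = {slope:.2f} * dose + {intercept:.2f}")
```

Because the sample points lie exactly on a line, the fitted slope and intercept recover it; with real measurements the line would only approximate the data, and the residuals would indicate how well a linear model suits the process.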
DT is another popular model used in various fields, including wastewater treatment. It creates decision diagrams for predicting effluent nitrogen, optimizing wastewater network maintenance, predicting water quality, and controlling sewage discharge.
Logistic Regression is a supervised classification algorithm that predicts outcomes as probabilities between 0 and 1. It has been used to predict total inorganic nitrogen in treated wastewater, estimate sewer pipe failures, identify suitable indicators for wastewater monitoring, categorize citizen messages, and forecast groundwater flooding in sewer networks.
K-Nearest Neighbors (KNN) is a non-parametric supervised learning algorithm that classifies or predicts data points based on proximity. It has been used to examine factors causing septic failures, predict total coliform removal in urban wastewater treatment, and forecast water quality and effluent total nitrogen in wastewater treatment facilities.
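The probability output comes from passing a weighted sum of the inputs through the logistic (sigmoid) function. The following minimal sketch shows that step only, with no training; the feature names, weights, and bias are hypothetical values chosen for illustration, not fitted parameters from any study.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_probability(features, weights, bias):
    """Probability of the positive class for one observation."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical sewer-pipe example: features are (age in years, diameter in m).
# The weights and bias below are made up, not learned from data.
weights = [0.08, -1.5]
bias = -3.0

old_narrow_pipe = predict_probability([60.0, 0.3], weights, bias)
new_wide_pipe = predict_probability([5.0, 1.2], weights, bias)
print(f"old pipe: {old_narrow_pipe:.3f}, new pipe: {new_wide_pipe:.3f}")
```

In practice the weights would be estimated from labeled records, and a threshold (commonly 0.5) would turn the probability into a class label such as "likely to fail".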
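Classification by proximity can be sketched as a majority vote among the k closest training points. The example below is a bare-bones illustration using Euclidean distance; the function name knn_classify and the septic-system records are invented, and the features are left unscaled for simplicity.

```python
import math
from collections import Counter


def knn_classify(train, query, k=3):
    """Label a query point by majority vote among its k nearest neighbors.

    train is a list of (feature_tuple, label) pairs.
    """
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training points")
    # Sort training points by Euclidean distance to the query, keep the k nearest.
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Made-up records: (system age in years, depth to water table in m) -> status.
training_data = [
    ((5.0, 3.0), "ok"),
    ((8.0, 2.5), "ok"),
    ((30.0, 0.8), "failed"),
    ((35.0, 0.5), "failed"),
    ((28.0, 1.0), "failed"),
]

print(knn_classify(training_data, (32.0, 0.7)))
print(knn_classify(training_data, (6.0, 2.8)))
```

Since distance drives the result, features on very different scales (years versus meters here) would normally be standardized first so that no single feature dominates the vote.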
Conclusion
To sum up, the study highlights the increasing use of ML models in wastewater treatment, driven by their adaptability and effectiveness in addressing complex system challenges. Prominent models include ANN, RF, SVM, LR, ANFIS, DT, and GB, with model choice dependent on specific problem characteristics and data quality. The authors acknowledge limitations in data coverage and emphasize the importance of carefully selecting and interpreting ML models in wastewater applications, with the ultimate goal of promoting sustainability and enhancing quality of life through AI-driven wastewater management.