In a paper published in the journal Environments, researchers employed bibliometric and meta-analysis techniques to investigate the use of supervised machine learning regression models in satellite-based water quality monitoring. The study revealed a growing interest in satellite technology as a cost-effective and expansive approach to monitoring water quality amid rising anthropogenic water pollution.
Machine learning was instrumental in analyzing complex satellite geospatial data, with deep neural networks emerging as favored algorithms. The research provided valuable insights into trends, critical sensors, and indicators, offering a comprehensive resource for the growing field of geospatial artificial intelligence (AI) in water quality monitoring.
Background
Water pollution poses a significant global concern, impacting the economy, safety, the environment, and human health. Issues like high costs, time-intensive procedures, safety risks, and limited spatial and temporal coverage plague traditional water quality monitoring methods. Satellite data offers a cost-effective and efficient alternative, providing real-time results for the timely identification of water quality issues on a planetary scale.
The growth of satellite remote sensing has led to an influx of big geospatial data, necessitating cluster-based high-performance computing systems and cloud platforms to handle the volume, velocity, and variety of data. These technologies enable efficient data analysis and interpretation, enhancing the application of satellite insights in water quality monitoring. AI techniques like time-series forecasting, classification, regression predictions, computer vision, and natural language processing play vital roles in predicting critical events such as algal blooms, which in turn improves water resource management and sustainability.
Scope and Objectives
This paper sheds light on the shortcomings of prior reviews in providing a thorough grasp of the remote sensing literature. Many previous studies lacked comprehensive bibliometric analyses or meta-analyses. This review aims to address these gaps by combining bibliometric and meta-analysis methods to synthesize and evaluate previous research on the application of machine learning and satellite data in water quality monitoring. Its goal is to draw conclusions about the factors that influence regression algorithm performance in predicting water quality from satellite data, such as the satellite sensor type, AI techniques, water quality parameters, and geographic location. The paper also covers available analysis-ready data (ARD) preprocessing methods.
Methods and Bibliometric Results
Researchers conducted an extensive search of the Scopus database from 2005 to the present day to comprehensively evaluate machine and deep learning applications in satellite-based water quality monitoring. The search used relevant keywords and followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines. It revealed China, the United States, and the European Union (excluding the United Kingdom) as the most prolific regions in this research field. China, in particular, stood out due to significant investments in satellite technology and artificial intelligence. The researchers assessed article quality using a publication production quality index (PPQI), and influential journals like Remote Sensing and Remote Sensing of Environment played a crucial role in this evaluation. The study also identified the top 10 most cited papers, showcasing their impact within the scholarly community.
Machine Learning Techniques and Satellite Sensors
The research presents the machine and deep learning techniques commonly used for satellite-based water quality monitoring. The authors categorized these techniques by learning type and subdivided them by application and specific algorithm name. This taxonomy is a valuable resource for researchers seeking the most suitable algorithms. Additionally, they explored 20 satellite sensors used in water quality monitoring, categorizing them by imaging system type and highlighting their advantages and disadvantages. This information assists researchers in making informed choices when selecting satellite sensors for their specific monitoring applications.
Satellite-based water quality monitoring relies on comprehensive datasets and open-source code resources to overcome challenges such as limited expertise and training data. These resources promote collaboration, transparency, and advancements in the field while facilitating the use of machine-learning algorithms. This shift toward open science accelerates research and enhances reproducibility, improving water quality monitoring capabilities.
Limitations and Gaps
The review has limitations, such as its reliance on the Scopus database, which may exclude some niche journals, and its use of the coefficient of determination (R²) in the meta-analysis, which assumes identical study conditions. Research gaps include the need for more explainable artificial intelligence (XAI) methods to enhance model transparency, the black-box nature of deep neural networks (DNNs), and the continued prevalence of traditional model evaluation metrics. Additionally, more progress is needed in monitoring certain water quality aspects, such as contaminants of emerging concern (CECs).
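To illustrate why comparing the coefficient of determination across studies is delicate, here is a minimal Python sketch. It is not the paper's analysis: the function name and the chlorophyll-a values are hypothetical, chosen only to show that the same absolute prediction errors can give very different scores when the observed values span different ranges.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally sized sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical chlorophyll-a values (made up for illustration only).
# Both cases have identical absolute errors: +1, -1, +1, -1.
narrow_obs = [4.0, 5.0, 6.0, 7.0]       # site with little variation
narrow_pred = [5.0, 4.0, 7.0, 6.0]
wide_obs = [4.0, 14.0, 24.0, 34.0]      # site with large variation
wide_pred = [5.0, 13.0, 25.0, 33.0]

narrow_score = r_squared(narrow_obs, narrow_pred)
wide_score = r_squared(wide_obs, wide_pred)
print(f"narrow range: {narrow_score:.3f}, wide range: {wide_score:.3f}")
```

Because the score is normalized by the variance of the observations, a study on a water body with a wide range of values can report a much higher score than one on a uniform water body, even when the models err by the same amount.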
Recommendations and Prospects
The paper recommends synchronizing data acquisition with satellite overpasses, emphasizing the importance of XAI techniques, improving sensor design to mitigate noise, and using multiple databases for comprehensive literature searches. Future research should address gaps by generating labeled data and employing semi-supervised learning, data augmentation, and synthetic training data. Optimizing model evaluation metrics and integrating XAI are also crucial. Upcoming satellite missions, such as Plankton, Aerosol, Cloud, and Ocean Ecosystem (PACE), Geostationary Littoral Imaging and Monitoring Radiometer (GLIMR), Surface Biology and Geology (SBG), Landsat Next, and Sentinel-2C and -2D, present exciting prospects for advancing water quality monitoring and enhancing our understanding of the environment.
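The recommendation to synchronize data acquisition with satellite overpasses can be sketched as a simple matchup filter. This is an illustrative Python sketch under stated assumptions, not a method from the paper: the function name, the three-hour window, and all timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def match_to_overpass(sample_times, overpass_times, max_gap=timedelta(hours=3)):
    """Pair each in situ sample with its nearest overpass, if close enough.

    max_gap is an assumed tolerance; a suitable value depends on how
    quickly the water body changes.
    """
    if not overpass_times:
        return []
    matches = []
    for sample in sample_times:
        # Find the overpass closest in time to this sample.
        nearest = min(overpass_times, key=lambda t: abs(t - sample))
        if abs(nearest - sample) <= max_gap:
            matches.append((sample, nearest))
    return matches


# Hypothetical field sampling times and satellite overpass times.
samples = [
    datetime(2023, 6, 1, 10, 30),   # 30 minutes after an overpass
    datetime(2023, 6, 1, 18, 0),    # 8 hours after the nearest overpass
    datetime(2023, 6, 6, 9, 45),    # 15 minutes before an overpass
]
overpasses = [
    datetime(2023, 6, 1, 10, 0),
    datetime(2023, 6, 6, 10, 0),
]

matches = match_to_overpass(samples, overpasses)
for sample, overpass in matches:
    print(f"sample {sample} paired with overpass {overpass}")
```

Samples that fall outside the window are dropped rather than paired with a distant image, which keeps the training labels close in time to the satellite observations they describe.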
Conclusion
To sum up, from 2005 to 2023, satellite remote sensing and machine learning algorithms for monitoring water quality in various aquatic environments have significantly improved. Advances in algorithms, gaming industry investments in fast chips, and improved Internet access have made these technologies more accessible for research. Researchers have identified critical sensors and water quality indicators, primarily focusing on inland waters in China and the United States. This comprehensive review aims to bridge the gap between hydrology, computer science, and remote sensing, encouraging collaborative solutions for water quality challenges.