Enhancing Water Quality Modeling with AI

In this article published in the journal Nature, the authors aimed to improve water quality modeling in the Great Barrier Reef by matching ungauged catchments with gauged ones. They employed an explainable AI approach to identify catchment similarities and classify them based on their dissolved inorganic nitrogen (DIN) response categories.

Study: Enhancing Water Quality Modeling in the Great Barrier Reef with Explainable AI. Image credit: Generated using DALL·E 3

Background

Water quality modeling is essential for understanding and managing the health of aquatic ecosystems. It involves predicting the concentrations of various substances, including DIN, which plays a crucial role in water quality assessment. Accurate modeling of DIN is vital for addressing environmental challenges and making informed decisions regarding land use and conservation efforts.

Modeling DIN in ungauged catchments, areas without prior data collection, presents a significant challenge. Traditional methods rely heavily on data from gauged catchments, but such approaches are less suitable for DIN because it interacts in complex ways with both natural and human-induced factors. The problem is exacerbated by the limited availability of observed data in ungauged areas, which hinders the development of accurate water quality models.

DIN concentrations are influenced by a wide range of biotic and abiotic factors, which leads to spatial and temporal variability. Existing classification methods, which are based primarily on physical similarities among catchments, do not account for these complex biotic influences and therefore limit the predictive capabilities of water quality models.

While process-based models have proven effective for modeling abiotic processes, their applicability to constituents like DIN, which are influenced by biotic factors, remains largely unexplored. Research on the spatiotemporal scales necessary for accurate DIN modeling is notably deficient.

The authors of the present study addressed the existing research gaps in water quality modeling, particularly for DIN, in ungauged catchments. They proposed an innovative approach that used spatial data, specifically original vegetation data, as proxies to categorize catchments based on their DIN responses. The integration of Artificial Neural Networks (ANN) and Explainable AI (XAI) facilitated the matching of ungauged catchments to gauged ones while accounting for the intricate interplay of biotic and abiotic factors affecting DIN.

Study Results

Catchment Matching using ANN-PR and XAI-SHAP

The results of catchment matching show that, except for the Mary Catchment, the ungauged portions of gauged catchments do not consistently classify together, and catchments do not necessarily classify with their nearest neighbors. The choice of spatial dataset used for matching led to different catchment matches. While Category 2 matched catchments generally clustered together spatially, Category 3 matched catchments had different distributions based on the dataset used. This indicates that different datasets reveal different spatial characteristics of the catchments.
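The dependence of matches on the chosen dataset can be illustrated with a small sketch. It does not reproduce the study's ANN-PR classifier; it swaps in a plain nearest-neighbour rule on Euclidean distance, and every catchment name and feature value is hypothetical.

```python
import math

def nearest_gauged(ungauged_features, gauged):
    """Return the gauged catchment whose feature vector is closest (Euclidean)."""
    return min(gauged, key=lambda name: math.dist(ungauged_features, gauged[name]))

# Hypothetical feature fractions for two gauged catchments under two datasets
land_use = {"gauged_A": [0.7, 0.2, 0.1], "gauged_B": [0.2, 0.5, 0.3]}
vegetation = {"gauged_A": [0.1, 0.8, 0.1], "gauged_B": [0.6, 0.3, 0.1]}

# The same (hypothetical) ungauged catchment described by each dataset
ungauged_land_use = [0.6, 0.3, 0.1]
ungauged_vegetation = [0.5, 0.4, 0.1]

match_by_land_use = nearest_gauged(ungauged_land_use, land_use)
match_by_vegetation = nearest_gauged(ungauged_vegetation, vegetation)

print(match_by_land_use, match_by_vegetation)
```

In this toy case the same ungauged catchment is paired with a different gauged catchment depending on which dataset describes it, which mirrors the pattern reported above.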

Variable Feature Independence

The study revealed that each catchment had a unique combination and weighting of deviated features. The top 10% of XAI-SHAP floristic structure variables could match an ungauged catchment to its most similar gauged catchment based on the combination of deviated variables. The approach also identified catchments with unique combinations of deviated variables, that is, combinations that occurred only in ungauged catchments and not in gauged ones.
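A rough sketch of this idea follows. It assumes attribution scores are already available (they are made up here rather than computed with SHAP), keeps the top 10% of variables by magnitude, and compares sets of deviated variables using Jaccard overlap. The overlap measure, the names, and the values are illustrative assumptions, not the authors' procedure.

```python
def top_percent(scores, percent=10):
    """Names of the highest-magnitude variables, keeping at least one."""
    keep = max(1, (len(scores) * percent + 99) // 100)  # integer ceiling
    ranked = sorted(scores, key=lambda name: abs(scores[name]), reverse=True)
    return ranked[:keep]

def jaccard(a, b):
    """Share of variables common to both sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def best_match(target, gauged_sets):
    """Gauged catchment with the highest overlap, or None when nothing overlaps."""
    name = max(gauged_sets, key=lambda g: jaccard(target, gauged_sets[g]))
    return name if jaccard(target, gauged_sets[name]) > 0 else None

# Hypothetical attribution magnitudes for 30 variables
scores = {f"var_{i}": float(i) for i in range(1, 31)}
top_variables = top_percent(scores)

# Hypothetical sets of deviated variables per catchment
gauged_sets = {"gauged_A": {"var_30"}, "gauged_B": {"var_29", "var_30"}}
matched = best_match({"var_29", "var_30"}, gauged_sets)
unmatched = best_match({"var_28"}, gauged_sets)

print(top_variables, matched, unmatched)
```

The second ungauged catchment deviates in a variable that no gauged catchment shares, so the sketch returns no match, analogous to the unique combinations described above.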

ANN-WQ Simulator Performance

The performance of the ANN-WQ simulator was notably influenced by the combination of catchments included in the training datasets. Training on data grouped from multiple catchments produced satisfactory to very good performance for DIN simulation. However, simulations generated in the unsupervised environment for individual catchments showed flatline results. Simulations trained on datasets discriminated by spatiotemporal regime achieved satisfactory to very good performance for most metrics.
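This summary does not name the performance metrics used. As an assumption for illustration only, the sketch below uses the Nash-Sutcliffe efficiency (NSE), a skill score widely used in hydrology, to show why a flatline simulation scores poorly: a constant prediction equal to the observed mean has an NSE of exactly zero. The observations are hypothetical.

```python
from statistics import fmean

def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    mean_obs = fmean(observed)
    error = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error / spread

# Hypothetical DIN observations and two candidate simulations
observed = [1.0, 2.0, 3.0, 4.0]
flatline = [fmean(observed)] * len(observed)  # constant output at the mean
perfect = list(observed)

nse_flatline = nse(observed, flatline)
nse_perfect = nse(observed, perfect)

print(nse_flatline, nse_perfect)
```

A score of 1 indicates a perfect fit, while a score at or below zero means the simulation is no better than predicting the observed mean.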

Classifying Catchments: Variable Independence vs ANN-PR

While the ANN-PR approach matched all ungauged catchments to a gauged counterpart, the XAI-SHAP variable independence approach using relative variable distributions could not match 17 catchments. The ANN-PR matches using the Original Vegetation dataset most closely aligned with the XAI-SHAP landform and floristic structure dataset.

Verification of Catchment Classification for DIN Similarities

Both XAI-SHAP Variable Independence and ANN-PR techniques for catchment classification matched the pseudo-ungauged Herbert catchment to the gauged Mary catchment. Performance criteria clustered towards datasets discriminated to Mary and Category 1 flows only. Training data discriminated to the individually matched catchment (Mary) and discriminated to wet season flows achieved the best DIN simulations.

Discussion

This research focused on classifying ungauged catchments that flow into the Great Barrier Reef based on proxy data for drivers of DIN using an explainable AI approach called XAI-SHAP. The study demonstrates the importance of data for proxy drivers of DIN in classifying catchments and evaluating DIN simulation performances. Dataset complexity and consistency, training dataset arrangements, and prior knowledge of spatiotemporal similarities play crucial roles in the performance of the ANN-WQ simulator.

  • Dataset Complexity and Consistency: The complexity and representative flow patterns in datasets greatly influence the performance of the ANN-WQ simulator. Flatline simulations result from inadequate complexity in the dataset, and the relationship between flow, spatial data, and DIN response is essential for simulations.
  • Training Dataset Influence: Training data arrangements that group catchments using prior knowledge of spatiotemporal similarities or datasets that discriminate by flow regime improve DIN simulation performance. These arrangements remove heteroscedasticity in DIN patterns to flow.
  • ANN-PR vs XAI-SHAP Classification: Catchments matched using ANN-PR are not always the same as those recommended by the XAI-SHAP deviation approach for variable independence. The XAI-SHAP approach provides insights into catchments grouped by known DIN-to-flow proxy drivers, showing that the catchment with the most similar drivers of DIN is not necessarily a neighboring one.
  • Practical Application: The study establishes Original Vegetation as a suitable proxy for DIN dynamics in water quality modeling. Out of 41 ungauged catchments, only 20 are suitable for data transfer with existing gauged catchments for water quality modeling. For the ungauged catchments that failed to match gauged ones, new monitoring and gauging sites are recommended to collect data representative of all DIN regimes.
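The training-data arrangement described in the list above, where datasets are discriminated by flow regime, can be sketched as a simple split. The flow threshold, the regime labels, and the observations below are hypothetical; the point is only that the spread of DIN can differ sharply between regimes, which is the heteroscedasticity the arrangement is meant to remove.

```python
import statistics

def split_by_regime(records, flow_threshold):
    """Group DIN values from (flow, din) records into 'low' and 'high' flow regimes."""
    groups = {"low": [], "high": []}
    for flow, din in records:
        groups["high" if flow >= flow_threshold else "low"].append(din)
    return groups

# Hypothetical (flow, DIN) observations; units are illustrative only
records = [(2, 0.10), (3, 0.12), (4, 0.11), (50, 0.5), (80, 1.5), (65, 1.0)]

groups = split_by_regime(records, flow_threshold=10)

# Population variance of DIN within each regime
spread = {name: statistics.pvariance(values) for name, values in groups.items()}

print(groups, spread)
```

Under this reading, a simulator trained separately on each subset sees a more uniform DIN-to-flow relationship than one trained on the pooled data.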

Conclusion

To sum up, the authors successfully matched ungauged catchments that flow into the Great Barrier Reef with gauged catchments using ANN-PR and datasets related to Land Use and Original Vegetation. Additionally, the XAI-SHAP method was employed to explain similarities between catchments based on feature deviations and to group them into known spatiotemporal categories, which improved the performance of the ANN-WQ simulator.

However, it was found that not all catchments matched using ANN-PR shared deviated feature similarity with a spatiotemporal category, suggesting the need for further monitoring in those unmatched areas. Prior discrimination of data based on the spatiotemporal category of ungauged catchments significantly enhanced the ANN-WQ simulator's performance. These findings highlighted the value of XAI-SHAP in customizing catchment matching for water quality datasets, emphasizing the importance of knowledge derived from original vegetation data in this process.


Written by

Soham Nandi

Soham Nandi is a technical writer based in Memari, India. His academic background is in Computer Science Engineering, specializing in Artificial Intelligence and Machine learning. He has extensive experience in Data Analytics, Machine Learning, and Python. He has worked on group projects that required the implementation of Computer Vision, Image Classification, and App Development.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Nandi, Soham. (2023, October 29). Enhancing Water Quality Modeling with AI. AZoAi. Retrieved on October 24, 2025 from https://www.azoai.com/news/20231029/Enhancing-Water-Quality-Modelling-with-AI.aspx.

  • MLA

    Nandi, Soham. "Enhancing Water Quality Modeling with AI". AZoAi. 24 October 2025. <https://www.azoai.com/news/20231029/Enhancing-Water-Quality-Modelling-with-AI.aspx>.

  • Chicago

    Nandi, Soham. "Enhancing Water Quality Modeling with AI". AZoAi. https://www.azoai.com/news/20231029/Enhancing-Water-Quality-Modelling-with-AI.aspx. (accessed October 24, 2025).

  • Harvard

    Nandi, Soham. 2023. Enhancing Water Quality Modeling with AI. AZoAi, viewed 24 October 2025, https://www.azoai.com/news/20231029/Enhancing-Water-Quality-Modelling-with-AI.aspx.
