Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment

In an article in press at the journal Science of The Total Environment, researchers demonstrated the feasibility of using an explainable machine learning (ML) model to predict and assess complex drought impacts.

Study: Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment. Image Credit: i am adventure/Shutterstock

Background

Drought is a common natural disaster with extensive environmental, economic, and social impacts. Compared with the impacts of other natural disasters, such as wildfires and floods, drought impacts are primarily intangible and indirect, which makes them difficult to assess and quantify using commonly utilized datasets such as remotely sensed imagery.

Although researchers have proposed more than 100 indicators for drought early-warning systems that monitor the frequency, severity, intensity, and onset of drought events, most of these studies did not demonstrate a correlation between the indicators and drought impacts.

Thus, two key challenges hinder the quantitative assessment and prediction of multi-dimensional, complex drought impacts: the lack of quantitative impact data with high-quality temporal and spatial resolution across sectors, and the non-linearity and complexity of the relationships between drought indicators and impacts.

ML-based models are increasingly being used to monitor and predict droughts owing to their exceptional performance on predictive tasks. The state-of-the-art (SOTA) ML-based models can more accurately capture the non-linear and complex characteristics of droughts compared to traditional regression models.

Moreover, ML models are computationally less demanding and more robust for high-dimensional datasets. Several drought studies have used Random Forest (RF), Artificial Neural Network, and Support Vector Machine (SVM) models to predict visible impacts such as vegetation stress or hydro-meteorological drought events. However, complex ML models trained on datasets with a high-dimensional feature space lack explainability. This black box issue is a major disadvantage of using these models for drought studies.

Model failures in drought assessment and monitoring, specifically false negative predictions, can be costly for society. Thus, the explainability of ML models is crucial for studying droughts and their impacts. Because drought impact prediction is a real-world task requiring prompt action and risk management, ML models must explain both their outcomes and the relationships they learn between drought impacts and the features that capture environmental and climate conditions.

Although a limited number of drought studies have used explainable ML models, no study has used them to comprehend the complex relationships between drought indicators commonly used in the United States (US) and multi-dimensional drought impacts. Another major disadvantage is that ML models are often not robust enough to ensure generalizability and reproducibility. ML models used in most drought analysis studies lack an easily reproducible systematic pipeline.

Complex drought impact assessment and prediction

In this paper, researchers proposed an explainable ML pipeline that combines the extreme gradient boosting (XGBoost) model with the SHapley Additive exPlanations (SHAP) method, built on a comprehensive drought impact database for the US. The pipeline applied this advanced tree-based model to predict and understand the complex impacts of droughts in a climate policy setting.

The pipeline used explainable ML to interpret the relationships between multi-dimensional drought impacts and hydro-meteorological indicators, grounding those interpretations in prediction results evaluated with performance metrics suited to high-stakes event prediction.

A text-based, multi-source dataset was used to label monthly drought impacts at the county level across several dimensions in the continental US. Multi-source reports obtained from the Drought Impact Reporter (DIR) were used to indicate drought impact occurrences.

The standardized temperature index (STI) and standardized precipitation index (SPI) were applied at several time scales to describe short- to long-term temperature and precipitation anomalies, respectively, and to account for the lagged and accumulated effects of drought at different time scales.
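To make the idea of a multi-scale standardized index concrete, the following is a minimal sketch, not the procedure used in the study. It accumulates a monthly series over a trailing window and converts the result to z-scores. Operational SPI implementations typically fit a probability distribution (often a gamma distribution) to the accumulated precipitation before transforming it to a standard normal variable, so the plain z-score here is a simplifying assumption. The function names and the sample precipitation values are hypothetical.

```python
from statistics import mean, stdev


def rolling_sum(values, window):
    """Trailing-window accumulation; the first window-1 entries are None."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough history yet
        else:
            out.append(sum(values[i + 1 - window : i + 1]))
    return out


def standardized_index(values, window):
    """Simplified SPI-like index: z-score of the trailing-window accumulation.

    Assumption: a plain z-score stands in for the distribution fitting
    that a full SPI calculation would perform.
    """
    acc = rolling_sum(values, window)
    valid = [v for v in acc if v is not None]
    mu, sigma = mean(valid), stdev(valid)
    return [None if v is None else (v - mu) / sigma for v in acc]


# Hypothetical monthly precipitation totals (mm) for one county.
precip_mm = [50, 40, 60, 10, 5, 0, 20, 80, 70, 30, 45, 55]

# Short and longer time scales capture immediate and accumulated deficits.
index_1_month = standardized_index(precip_mm, window=1)
index_3_month = standardized_index(precip_mm, window=3)
```

Strongly negative values mark months whose accumulated precipitation is well below the series average. Longer windows respond more slowly, which is how such indices can reflect lagged and accumulated effects.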

Researchers investigated the ability of the explainable ML models to learn and interpret the relationships between precipitation- and temperature-based hydro-meteorological indicators and various drought impacts. Additionally, the land cover (LC) dataset was employed to account for spatial variations in the geographic environment, while the social vulnerability index (SVI) was used as a comprehensive indicator of regional vulnerability and socioeconomic status.

The US climate regions from the National Centers for Environmental Information (NCEI) and monthly time tags were also included to reflect the general temporal and spatial patterns in the continental US. Feature engineering was used to process the input features. RF, SVM, One Rule (OneR), and Logistic Regression (LR) models served as the baselines against which the predictive performance of XGBoost was compared at the national and state levels.

Significance of the study

The XGBoost models significantly outperformed the baseline models in predicting multi-dimensional drought impact occurrences from the text-based DIR data, attaining average F2 scores (a weighted harmonic mean of precision and recall) of 0.883 and 0.942 at the national and state levels, respectively. Specifically, when predicting the occurrence of seven drought impact types at the national level, the XGBoost models improved F2 scores by 40.2% and 17.3% over the LR and RF models, respectively.
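For readers unfamiliar with the metric, the sketch below assumes that F2 refers to the standard F-beta score with beta = 2, which weights recall more heavily than precision and therefore penalizes missed events (false negatives) more than false alarms. The labels and predictions are made up for illustration and are not taken from the study.

```python
def fbeta_score(y_true, y_pred, beta=2.0):
    """F-beta score for binary labels (1 = impact occurred, 0 = no impact)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical county-month labels and model predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 1, 0, 0]

# Precision is 3/5 and recall is 3/4 for this example.
print(round(fbeta_score(y_true, y_pred, beta=2.0), 3))  # 0.714
print(round(fbeta_score(y_true, y_pred, beta=1.0), 3))  # 0.667
```

Because recall (0.75) exceeds precision (0.6) in this example, the F2 score is higher than the F1 score. A model that missed more events would see its F2 score fall faster than its F1 score.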

Additionally, the average F2 score of the state-level XGBoost models, taken over all drought impacts and selected states, was 6.7% higher than that of the national-level models, which indicated the feasibility of using XGBoost to predict complex drought impact occurrence from drought indicators at both spatial scales.

The model interpretation at the state level demonstrated that STI and SPI contributed significantly to multi-dimensional drought impact prediction. Moreover, the relationships, time scales, and importance of STI and SPI varied with the location and the type of drought impact.

The patterns between SHAP values and SPI variables revealed that negative SPI values contributed positively to the predicted drought impacts. Thus, the SPI-based explanations improved the trustworthiness of the XGBoost models, as the outcomes aligned with expert knowledge.
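The attribution idea behind SHAP can be illustrated with exact Shapley values on a toy model. The sketch below is an illustration under stated assumptions, not the study's pipeline: `toy_impact_model` is a hypothetical logistic scoring function with made-up coefficients rather than a trained XGBoost model, and features missing from a coalition are replaced by a single reference point rather than averaged over background data. The SHAP library computes comparable quantities far more efficiently for tree ensembles.

```python
from itertools import combinations
from math import exp, factorial


def toy_impact_model(features):
    """Hypothetical impact probability; coefficients are illustrative only."""
    z = -1.5 * features["spi"] + 0.8 * features["sti"] + 1.0 * features["svi"] - 1.0
    return 1.0 / (1.0 + exp(-z))


def shapley_values(model, x, baseline):
    """Exact Shapley values of each feature in x relative to a baseline point."""
    names = list(x)
    n = len(names)

    def value(subset):
        # Features in the coalition take their actual value; others the baseline.
        mixed = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {name}) - value(coalition))
        phi[name] = total
    return phi


# Hypothetical county-month: dry (negative SPI), warm (positive STI).
x = {"spi": -2.0, "sti": 1.0, "svi": 0.5}
baseline = {"spi": 0.0, "sti": 0.0, "svi": 0.5}

phi = shapley_values(toy_impact_model, x, baseline)
for name, contribution in phi.items():
    print(name, round(contribution, 3))
```

In this toy setting the negative SPI value receives a positive attribution, meaning it pushes the predicted impact probability upward, which mirrors the pattern described above. The attributions also sum to the difference between the model output at `x` and at `baseline`, a defining property of Shapley values.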

To summarize, the findings of this study demonstrated the effectiveness of XGBoost models in accurately predicting complex drought impacts and in making the relationships between drought indicators and impacts more interpretable. Moreover, the study showed the potential of explainable ML to better comprehend multi-dimensional drought impacts at the regional level and to motivate proper responses.

Journal reference:
 
Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2023, July 26). Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment. AZoAi. Retrieved on September 18, 2024 from https://www.azoai.com/news/20230726/Unveiling-Drought-Impacts-How-Explainable-Machine-Learning-Revolutionizes-Prediction-and-Assessment.aspx.

  • MLA

    Dam, Samudrapom. "Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment". AZoAi. 18 September 2024. <https://www.azoai.com/news/20230726/Unveiling-Drought-Impacts-How-Explainable-Machine-Learning-Revolutionizes-Prediction-and-Assessment.aspx>.

  • Chicago

    Dam, Samudrapom. "Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment". AZoAi. https://www.azoai.com/news/20230726/Unveiling-Drought-Impacts-How-Explainable-Machine-Learning-Revolutionizes-Prediction-and-Assessment.aspx. (accessed September 18, 2024).

  • Harvard

    Dam, Samudrapom. 2023. Unveiling Drought Impacts: How Explainable Machine Learning Revolutionizes Prediction and Assessment. AZoAi, viewed 18 September 2024, https://www.azoai.com/news/20230726/Unveiling-Drought-Impacts-How-Explainable-Machine-Learning-Revolutionizes-Prediction-and-Assessment.aspx.
