In an article recently published in the journal Climate, researchers assessed flood susceptibility in Australian tropical cyclone-prone regions using a random forest (RF) machine learning (ML) model and satellite remote sensing data.
Background
Tropical cyclones are extreme weather phenomena that result in the destruction of infrastructure, high mortality, and significant economic loss. Torrential rain, storm surges, and destructive winds are the major hazards associated with tropical cyclones, and they can trigger landslides and flooding.
Although the frequency of tropical cyclones is projected to decrease under anthropogenic climate change, their intensity is expected to increase. Additionally, tropical cyclone precipitation rates are projected to increase worldwide, raising exposure to tropical cyclone-related flood hazards and making a better understanding of these hazards necessary.
In recent years, ML models, such as the adaptive neuro-fuzzy inference system (ANFIS), decision trees (DTs), support vector machines (SVMs), and artificial neural networks (ANNs), have gained significant attention for mapping flood hazards. These models use past flood data and flood-influencing factors (FIFs) to infer whether a location is likely to be flood-prone.
However, the accuracy of ML models relies primarily on the training dataset. This is a major limitation, as an error-prone training dataset can lead to inaccurate outputs. Additionally, a small training dataset makes it harder for ML models to predict and extrapolate values at or beyond the spatial boundary of the training range.
Despite these limitations, ML models are more accurate than traditional multi-criteria decision-making (MCDM) and statistical methods for flood hazard assessment and mapping.
The proposed approach
In this study, researchers investigated the flooding in coastal regions of Australia caused by tropical cyclone Debbie in 2017. Specifically, they used a differential evolution (DE)-optimized RF model, together with flood data from the event and satellite remote sensing data, to model flood susceptibility in the landfalling region of tropical cyclone Debbie (Airlie Beach, Mackay, and Bowen in North Queensland) and created a flood hazard map.
An RF model was trained, optimized, and validated on the flooding event during tropical cyclone Debbie using specific FIFs to identify the flood susceptibility of each location. Additionally, the SHapley Additive exPlanations (SHAP) method was used to quantify the impact of each FIF, explain the flood hazard map, and critically evaluate the flood hazard assessment.
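To illustrate the idea behind SHAP, the following minimal sketch computes exact Shapley values by brute force for a tiny, made-up scoring function. Everything in it is an assumption for illustration: `toy_flood_score`, the three features, the thresholds, and the single baseline point are hypothetical and are not the study's RF model or data. In practice, SHAP tooling relies on efficient algorithms for tree ensembles and a background dataset rather than enumerating every feature subset.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(coalition):
        # Features in the coalition keep their real value; the rest use the baseline.
        return model([x[i] if i in coalition else baseline[i] for i in range(n)])

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_flood_score(features):
    # Hypothetical stand-in for a trained model's flood score (not the study's RF).
    elevation_m, dist_to_river_km, ndvi = features
    low_lying = 1.0 if elevation_m < 20 else 0.0
    near_river = 1.0 if dist_to_river_km < 1 else 0.0
    return 0.5 * low_lying + 0.3 * near_river + 0.2 * low_lying * near_river - 0.1 * ndvi


x = [5.0, 0.4, 0.3]          # made-up low-lying cell close to a river
baseline = [60.0, 5.0, 0.3]  # made-up elevated inland reference cell
phi = shapley_values(toy_flood_score, x, baseline)
```

The contributions in `phi` add up to the difference between the score at `x` and the score at `baseline`, which is the property that lets per-feature impacts be mapped location by location.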
The study used flood data from the Copernicus Emergency Management Service activation EMSR200, a rapid flood mapping using synthetic aperture radar on 29 March 2017. The FIFs, namely elevation, slope angle, stream power index (SPI), topographical wetness index (TWI), terrain ruggedness index (TRI), distance to river (DtR), soil moisture (SM), normalized difference vegetation index (NDVI), and land use land cover (LULC), were selected based on the literature and their availability within the study region.
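Several of these factors are derived quantities. As a rough sketch, the commonly used textbook definitions of TWI, SPI, and NDVI can be written as below. The article does not state which formulations or preprocessing the authors applied, so the formulas, function names, and the `min_slope_deg` clamp are assumptions for illustration only.

```python
import math


def twi(specific_catchment_area_m, slope_deg, min_slope_deg=0.1):
    """Topographic wetness index, ln(a / tan(beta))."""
    # Clamp near-flat cells so tan(beta) is never zero (an assumed convention).
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(specific_catchment_area_m / math.tan(beta))


def spi(specific_catchment_area_m, slope_deg):
    """Stream power index, a * tan(beta)."""
    return specific_catchment_area_m * math.tan(math.radians(slope_deg))


def ndvi(nir, red):
    """Normalized difference vegetation index, (NIR - Red) / (NIR + Red)."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


# Made-up cell: 500 m of upslope contributing area per unit contour width, 2 degree slope.
cell_twi = twi(500.0, 2.0)
cell_spi = spi(500.0, 2.0)
cell_ndvi = ndvi(nir=0.5, red=0.1)
```

Here `a` is the specific catchment area and `beta` the local slope angle; flat cells with a large contributing area get a high TWI, while steep cells with a large contributing area get a high SPI.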
The receiver operating characteristic (ROC) curve and overall accuracy (OA) were used as metrics to evaluate model performance. DE was used to optimize the hyperparameters of the RF algorithm. Moreover, the RF predict function was used to create the flood hazard map from the FIF datasets.
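The hyperparameter search with DE can be sketched as follows. This is a minimal pure-Python implementation of a basic DE scheme (rand/1/bin), not the authors' code: the article does not specify the DE variant, the hyperparameters tuned, or their ranges. The objective `toy_validation_error` is a hypothetical stand-in for the validation error of an RF; in a real workflow it would train and score a model for each candidate, and integer hyperparameters would be rounded before use.

```python
import random


def differential_evolution(objective, bounds, pop_size=20, mutation=0.8,
                           crossover=0.7, generations=100, seed=0):
    """Minimize objective over box bounds with a basic rand/1/bin DE scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct individuals other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d, (lo, hi) in enumerate(bounds):
                if d == forced or rng.random() < crossover:
                    gene = pop[a][d] + mutation * (pop[b][d] - pop[c][d])
                    gene = min(max(gene, lo), hi)  # clip back into the search box
                else:
                    gene = pop[i][d]
                trial.append(gene)
            trial_score = objective(trial)
            if trial_score <= scores[i]:  # greedy selection keeps the better candidate
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]


def toy_validation_error(params):
    # Hypothetical surrogate for an RF's validation error, lowest at
    # 300 trees and depth 12 (numbers chosen purely for illustration).
    n_trees, max_depth = params
    return ((n_trees - 300) / 500) ** 2 + ((max_depth - 12) / 20) ** 2 + 0.07


best_params, best_error = differential_evolution(
    toy_validation_error, bounds=[(10, 500), (2, 30)])
```

DE needs only objective values, not gradients, which is one reason it suits a search space where each evaluation is a full train-and-validate cycle.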
Significance of the study
The flood susceptibility of the tropical cyclone Debbie landfalling region was modeled successfully using a DE-optimized RF model, 988 flooded data points, and nine FIFs. The model scoring allowed an effective assessment of the model's ability to learn the rules of flooding in the study area while ignoring noise in the data, leading to a 93.5% OA on the training dataset.
Additionally, the model achieved 80.2% accuracy on the testing dataset, which indicated that it remained accurate on unseen data. The ROC curve also demonstrated the model’s ability to differentiate effectively between non-flooded and flooded points.
The ROC curve plots the false positive rate (FPR) on the x-axis against the true positive rate (TPR) on the y-axis at each classification threshold. The area under the curve (AUC) achieved was 0.925, which demonstrated the model's robustness.
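As a small worked example of these metrics, the sketch below builds ROC points (false positive rate against true positive rate, one point per threshold), integrates the area under the curve with the trapezoidal rule, and computes overall accuracy at a fixed threshold. The six labels and scores are made up for illustration and have no connection to the study's data. The 0.5 threshold is likewise an assumption.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) pairs, one per distinct score threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def overall_accuracy(labels, scores, threshold=0.5):
    """Share of points classified correctly at the given threshold."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


labels = [1, 1, 0, 1, 0, 0]              # 1 = flooded, 0 = non-flooded (made up)
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # made-up predicted flood probabilities

curve = roc_points(labels, scores)
area = auc(curve)                        # 8/9, about 0.889
accuracy = overall_accuracy(labels, scores)  # 5/6, about 0.833
```

In this toy case, eight of the nine flooded/non-flooded pairs are ranked correctly, which is exactly what an area of 8/9 expresses. An area of 1.0 would mean every flooded point scores above every non-flooded one, and 0.5 would be no better than chance.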
Elevation was the most important FIF, as low-lying coastal regions were the most prone to flooding and most flooding during the tropical cyclone occurred along the coast. Other important FIFs were slope, NDVI, SM, and DtR, while SPI, TWI, and TRI contributed minimally to the flood susceptibility outcome.
SHAP analysis confirmed the reliability of the flood hazard map. For instance, elevation was an important factor in the model’s prediction across the study region, except at locations between 20 and 100 m in elevation and in the inland area in the south. Similarly, DtR displayed a major impact along all major rivers, while SPI values were high near rivers.
Thus, explainable artificial intelligence (AI) improved the interpretation of the model's predictions, helping decision-makers better understand ML-based flood hazard assessments and mitigate the adverse impacts of flooding in coastal regions affected by tropical cyclones.