Using AI and cutting-edge climate models, scientists unlock century-old climate patterns to better understand Europe’s vulnerability to extreme weather events—critical insights for future policy and resilience planning.
Research: Artificial intelligence reveals past climate extremes by reconstructing historical records. Image Credit: Sepp photography / Shutterstock
In a paper published in the journal Nature Communications, researchers examined historical climate extremes and climate risk by using artificial intelligence (AI) to reconstruct European climate events. They leveraged Earth system model data from the Coupled Model Intercomparison Project Phase 6 (CMIP6) through transfer learning, surpassing traditional statistical methods (e.g., inverse distance weighting and Kriging) and diffusion models in accuracy. This approach enabled the reconstruction of extreme events and regional spatial trends from 1901 to 2018, enhancing data granularity and filling historical data gaps that hinder climate analysis and policy.
Related Work
Past work highlighted unprecedented global and European temperature records, with Europe experiencing record-breaking summers, droughts, and wildfires in recent years. These extremes underscore rising climate risks: the Intergovernmental Panel on Climate Change Sixth Assessment Report (IPCC AR6) identified Europe as significantly affected by intense and frequent heat events since 1960. Advances in deep learning (DL) have enhanced climate data infilling, with convolutional neural network (CNN) and generative adversarial network (GAN) methods proving more reliable and accurate than traditional statistical approaches.
Climate Data Reconstruction Techniques
The Hadley Centre Global Climate Extremes Index, version 3 (HadEX3), and HadEX-CAM datasets provide climate indices derived from daily precipitation and temperature data collected at over 30,000 weather stations. For HadEX3, index values undergo quality control and are interpolated onto a latitude-longitude grid using the angular distance weighting (ADW) method, which is effective for extrapolating sparse station data. However, due to the limitations of ADW in capturing spatial variability, HadEX3 data are unsuitable for AI-based infilling research.
Instead, HadEX-CAM was created using the climate anomaly method (CAM), which does not interpolate but averages valid station data within grid boxes on a 1.875° × 1.25° grid. This CAM approach ensures index calculation accuracy by referencing the 1981–2010 period, and stations missing data for more than three days in a month are excluded to avoid spurious values. To further improve data integrity, anomalous values from specific stations (e.g., Lugano) were excluded.
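The grid-box averaging described above can be sketched in a few lines. This is a minimal illustration, not the HadEX-CAM code: the function name `grid_box_means`, the station tuples, and the choice of grid origin are assumptions made for the example. Only the 1.875° × 1.25° box size comes from the text.

```python
import math
from collections import defaultdict

LON_STEP = 1.875  # grid-box width in degrees longitude (from the text)
LAT_STEP = 1.25   # grid-box height in degrees latitude (from the text)


def grid_box_means(stations):
    """Average station anomalies per grid box, with no interpolation.

    `stations` is an iterable of (lon, lat, anomaly) tuples; anomaly is None
    when the station has no valid value. Boxes with no valid station are
    absent from the result, i.e. they stay missing.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for lon, lat, anomaly in stations:
        if anomaly is None:
            continue
        # Hypothetical box indexing: boxes start at lon = 0, lat = -90.
        box = (math.floor(lon / LON_STEP), math.floor((lat + 90.0) / LAT_STEP))
        sums[box] += anomaly
        counts[box] += 1
    return {box: sums[box] / counts[box] for box in sums}


# Made-up station values, for illustration only.
stations = [
    (8.9, 46.0, 1.0),    # two stations that fall in the same box
    (9.1, 46.2, 3.0),
    (12.5, 41.9, None),  # no valid value, so this box stays empty
    (2.3, 48.9, -0.5),
]
boxes = grid_box_means(stations)
```

Because empty boxes are never filled in, the gridded field keeps its real gaps, which is what makes it usable as a target for an infilling model.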
A DL algorithm called climate reconstruction AI (CRAI) was used for data reconstruction. The CRAI model trains on the HadEX-CAM indices and missing-value masks, simulating realistic data gaps. The model was optimized by conducting hyperparameter searches, resulting in a configuration that maintained predictive accuracy across extreme indices by averaging outputs from multiple model runs. This ensemble approach also enabled uncertainty estimation, improving model robustness and showing high performance in data-scarce regions like the Mediterranean.
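As a rough sketch of the ensemble idea, the snippet below averages several reconstructions of the same grid cell by cell and uses the spread between them as an uncertainty proxy. The article does not say how CRAI derives its uncertainty estimate, so the use of the sample standard deviation, the function name `ensemble_mean_and_spread`, and the toy values are all assumptions.

```python
import statistics


def ensemble_mean_and_spread(member_fields):
    """Combine several reconstructions of the same grid, cell by cell.

    `member_fields` is a list of equally sized 2D lists, one per model run.
    Returns (mean_field, spread_field); the spread is the sample standard
    deviation across members, used here as a simple uncertainty proxy.
    """
    n_rows = len(member_fields[0])
    n_cols = len(member_fields[0][0])
    mean_field = [[0.0] * n_cols for _ in range(n_rows)]
    spread_field = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            values = [field[i][j] for field in member_fields]
            mean_field[i][j] = statistics.fmean(values)
            spread_field[i][j] = statistics.stdev(values)
    return mean_field, spread_field


# Three hypothetical model runs on a 2 x 2 grid (illustrative values).
runs = [
    [[10.0, 12.0], [8.0, 20.0]],
    [[12.0, 12.0], [10.0, 26.0]],
    [[14.0, 12.0], [12.0, 23.0]],
]
mean_field, spread_field = ensemble_mean_and_spread(runs)
```

Cells where the runs agree get a spread near zero, while cells where they disagree are flagged as less certain.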
TX90p is the percentage of days when the daily maximum temperature exceeds the 90th percentile. Left panel: original HadEX3 dataset. Central panel: original HadEX-CAM dataset. Right panel: reconstruction using CRAI.
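To make the TX90p definition concrete, here is a simplified sketch. It assumes the 90th-percentile threshold is already known and passed in as a single number; operational index calculations derive that threshold from a base period, which is not reproduced here. The function name `tx90p` and the temperatures are illustrative.

```python
def tx90p(daily_tmax, threshold):
    """Percentage of valid days whose maximum temperature exceeds `threshold`.

    `daily_tmax` holds daily maxima for one month (None marks a missing day);
    `threshold` stands in for the 90th-percentile value of the base period.
    Returns None when the month has no valid days.
    """
    valid = [t for t in daily_tmax if t is not None]
    if not valid:
        return None
    exceed = sum(1 for t in valid if t > threshold)
    return 100.0 * exceed / len(valid)


# Illustrative daily maxima in degrees Celsius, with one missing day.
month = [24.0, 26.5, 31.2, 29.8, None, 33.0, 27.1, 30.5, 25.0, 28.4, 32.1]
value = tx90p(month, threshold=30.0)
```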
The reconstruction focused on recreating monthly extreme indices directly, rather than daily values, to maintain consistency with HadEX3’s data structure. Limited training data led to a transfer learning approach using Earth system data from 45 CMIP6 simulations, with resolutions compatible with HadEX-CAM. The extreme indices were calculated, remapped onto the HadEX-CAM grid, and randomly split into training, validation, and test sets that drew on both higher- and lower-resolution models. This random split ensured comprehensive model training, capturing specific climate change trends and historical event markers over the data period.
The model evaluation relied mainly on two reanalysis datasets, the fifth-generation ECMWF Reanalysis (ERA5) and the 20th Century Reanalysis Version 3 (20CRv3), given their high alignment with HadEX3 data. While ERA5 provided detailed data from 1940 to 2018, 20CRv3 extended coverage to the early 20th century. Taylor diagrams confirmed ERA5’s alignment with HadEX-CAM indices, making it ideal for model evaluation across different climate periods.
Robust Climate Data Reconstruction
CRAI uses a U-Net architecture with partial convolutions to reconstruct extreme climate indices. Trained on historical simulations from the CMIP6 archive, CRAI was evaluated on datasets not included in the training, including simulation data, reanalysis datasets like ERA5, and observational data from HadEX-CAM. Metrics such as root mean square error (RMSE), Spearman rank-order correlation coefficient (SROCC), Wasserstein distance (WD), and the coefficient of determination (R²) showed that CRAI consistently outperformed traditional interpolation methods like inverse distance weighting (IDW) and Kriging, yielding lower RMSE values and better alignment with ERA5 and 20CRv3 data.
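The partial convolution mentioned above can be illustrated with a single-channel, pure-Python sketch. It follows the commonly used formulation in which only observed cells contribute to each window and the result is rescaled by the share of observed cells. This is not CRAI's code: the real model stacks many such layers with learned weights inside a U-Net, and the function name `partial_conv2d`, the averaging kernel, and the toy grid are assumptions.

```python
def partial_conv2d(field, mask, kernel, bias=0.0):
    """Single-channel partial convolution with zero padding.

    field  : 2D list of values (entries at masked cells are ignored)
    mask   : 2D list, 1 = observed, 0 = missing
    kernel : square 2D list of weights with odd side length
    Returns (output, updated_mask).
    """
    rows, cols = len(field), len(field[0])
    k = len(kernel)
    half = k // 2
    window_size = k * k
    output = [[0.0] * cols for _ in range(rows)]
    new_mask = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            weighted_sum = 0.0
            n_valid = 0
            for di in range(k):
                for dj in range(k):
                    r, c = i + di - half, j + dj - half
                    if 0 <= r < rows and 0 <= c < cols and mask[r][c]:
                        weighted_sum += kernel[di][dj] * field[r][c]
                        n_valid += 1
            if n_valid > 0:
                # Rescale so windows with few observations are not damped.
                output[i][j] = weighted_sum * window_size / n_valid + bias
                new_mask[i][j] = 1  # the cell now counts as filled
    return output, new_mask


# Toy grid with a missing centre cell; the stored 0.0 there is ignored.
field = [[1.0, 2.0, 3.0], [4.0, 0.0, 6.0], [7.0, 8.0, 9.0]]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
kernel = [[1.0 / 9.0] * 3 for _ in range(3)]  # simple averaging kernel
output, new_mask = partial_conv2d(field, mask, kernel)
```

With the averaging kernel, the missing centre cell is filled with the mean of its eight observed neighbours, and the updated mask marks it as valid for the next layer.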
Additionally, CRAI exhibited a remarkable ability to generalize, performing effectively on reanalysis datasets, further confirming its robustness in handling climate data with varying degrees of missing values and offering improved accuracy for extreme indices in high-risk areas.
In the comparative analysis, CRAI reconstructed the extreme indices with lower RMSE values than the other methods. The evaluation indicated that while all methods performed better in regions with fewer missing values, CRAI remained particularly effective in areas with a higher prevalence of missing data.
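For readers unfamiliar with the baseline and the headline metric, the sketch below shows a basic inverse distance weighting estimate and an RMSE calculation. It is a generic illustration, not the study's evaluation code: the Euclidean distance in grid units, the power of 2, the function names, and the sample points are all assumptions.

```python
import math


def idw_estimate(target, known_points, power=2.0):
    """Inverse distance weighting estimate at `target` = (x, y).

    `known_points` is a list of (x, y, value). Distances are plain Euclidean
    distances in grid units, a simplification for illustration.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, value in known_points:
        distance = math.hypot(x - target[0], y - target[1])
        if distance == 0.0:
            return value  # target coincides with an observation
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Three hypothetical observations and one location to fill.
known = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
estimate = idw_estimate((1.0, 0.0), known)
error = rmse([estimate, 12.0], [15.0, 10.0])
```

Nearby observations dominate the estimate, which is why IDW tends to struggle where stations are sparse, the setting in which the article reports CRAI holding up best.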
When applied to the full HadEX-CAM dataset over Europe, CRAI maintained alignment with findings from the IPCC AR6 report, revealing trends in warm days and cool nights. Discrepancies were observed mainly in earlier years, particularly for the Mediterranean region, highlighting both CRAI's sensitivity to regional patterns of missing data and its ability to enhance historical climate reconstruction.
Conclusion
To sum up, the study presented an AI-based reconstruction of temperature extreme indices in Europe from 1901 to 2018, leveraging transfer learning from diverse CMIP6 simulations and benefiting from the region's dense early measurements. The reconstructed dataset outperformed traditional methods like IDW and Kriging, preserving the accuracy of HadEX3 while offering detailed local climate representations.
Further analysis confirmed CRAI's capacity to reveal detailed regional patterns across extreme heat and cold events, aligning with IPCC AR6 findings while adding resolution to events like the 1911 European heatwave and the 1929 cold wave. This AI approach demonstrates the potential to support data-driven climate policy and adaptation planning by filling data gaps in historical records and enhancing climate risk assessment.
Journal reference:
- Plésiat, É., Dunn, R. J., Donat, M. G., & Kadow, C. (2024). Artificial intelligence reveals past climate extremes by reconstructing historical records. Nature Communications, 15(1), 1-12. DOI:10.1038/s41467-024-53464-2, https://www.nature.com/articles/s41467-024-53464-2