Artificial intelligence (AI) methods like deep learning (DL) are revolutionizing weather forecasting by processing massive datasets to generate faster, more accurate predictions. They can also identify complex weather patterns that traditional methods might miss. These advances are opening a new era of weather forecasting, with benefits for everyone from farmers to disaster response teams. This article discusses the importance and applications of AI, especially DL, in weather forecasting, along with some recent developments.
Importance of AI
Timely and accurate weather forecasting is critical for early warning of weather impacts on many aspects of human livelihood, such as harvesting and irrigation in agriculture, and construction work. For instance, weather forecasting offers decision-making support for autonomous vehicles, which depend on sensing and predicting external environmental factors like air visibility and rainfall, helping to reduce traffic congestion and accidents.
Although rooted in theory, current numerical weather prediction (NWP) methods face limitations. These include the need for powerful and costly computing resources, challenges in extracting useful insights from massive observational datasets, and an incomplete understanding of the underlying physical mechanisms governing weather.
Because the atmosphere is chaotic, small differences in initial conditions significantly influence NWP model results. Additionally, the different forms of uncertainty and the diversity in the datasets, coupled with the spatio-temporal correlations between datasets, pose substantial challenges to NWP.
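The sensitivity to initial conditions can be illustrated with the Lorenz-63 system, a classic three-variable toy model of atmospheric convection. The sketch below is not an NWP model; the parameter values, step size, and perturbation size are conventional choices assumed here for illustration. It integrates two runs that differ by one part in a hundred million and tracks how far apart they drift.

```python
import math


def lorenz_rhs(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations (standard parameter values)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))

    k1 = lorenz_rhs(state)
    k2 = lorenz_rhs(shifted(state, k1, dt / 2))
    k3 = lorenz_rhs(shifted(state, k2, dt / 2))
    k4 = lorenz_rhs(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def separation_history(initial, perturbation=1e-8, dt=0.01, steps=4000):
    """Distance over time between a reference run and a run with a perturbed x."""
    ref = initial
    pert = (initial[0] + perturbation, initial[1], initial[2])
    history = [math.dist(ref, pert)]
    for _ in range(steps):
        ref, pert = rk4_step(ref, dt), rk4_step(pert, dt)
        history.append(math.dist(ref, pert))
    return history


separations = separation_history((1.0, 1.0, 1.0))
print(f"initial separation: {separations[0]:.1e}")
print(f"largest separation: {max(separations):.1e}")
```

The two runs start almost identical and end up as far apart as two unrelated states on the attractor, which is the behavior that limits how far ahead a single deterministic forecast can be trusted.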
The success of data-driven DL methods in time series prediction, speech recognition, and computer vision shows their ability to effectively extract spatial and temporal features from spatio-temporal data. Thus, DL-based weather prediction (DLWP) can act as a suitable alternative to the conventional method, as meteorological data is big geospatial data.
DL as Enhancements to NWP Methods
Although several studies have already used data-driven DL in weather forecasting, the best forecast performance has been obtained by combining AI and NWP techniques. Conventional NWP data assimilation requires developing a handcrafted observation operator, which demands a dedicated human development effort for every assimilated observation type.
Moreover, only a small fraction of the available observations is used because data thinning methods are applied before assimilation. DL methods can be more efficient than conventional methods because they learn from data and can thereby exploit the full potential of the available observations. Sufficient training data is crucial for training a DL method. However, sourcing adequate training data to train an AI method that completely replaces a conventional NWP method is extremely difficult.
Thus, the available training data can instead be used to improve existing NWP physical models by adopting a residual learning approach, in which the DL model learns a correction to the physical model's output. DL can also make operations more efficient. For instance, 4D-Var NWP data assimilation, which is an inverse modeling problem, requires a time-consuming iterative online cost function optimization. Training a DL method also requires a cost function optimization, but it is performed only once, in an offline mode, which makes the DL approach operationally very efficient.
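The residual learning idea can be sketched in a few lines. In this minimal, hypothetical example, a simple least-squares line stands in for the DL model, and both the "observations" and the "NWP forecast" are synthetic values invented for illustration. The learned component is fitted only to the difference between observation and physical forecast, and its output is added back to the physical forecast.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def rmse(predicted, observed):
    """Root-mean-square error between two equally long sequences."""
    squared = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return (squared / len(observed)) ** 0.5


rng = random.Random(0)
# Synthetic stand-ins: "observed" temperatures and a physical forecast
# with a systematic (state-dependent) error plus noise.
observed = [rng.uniform(-5.0, 30.0) for _ in range(500)]
nwp = [0.9 * t + 2.0 + rng.gauss(0.0, 0.5) for t in observed]

train, test = slice(0, 400), slice(400, 500)

# The learned part models only the residual: observation minus physical forecast.
residuals = [o - f for o, f in zip(observed[train], nwp[train])]
slope, intercept = fit_line(nwp[train], residuals)

# Corrected forecast = physical forecast + predicted residual.
corrected = [f + (slope * f + intercept) for f in nwp[test]]

rmse_nwp = rmse(nwp[test], observed[test])
rmse_corrected = rmse(corrected, observed[test])
print(f"RMSE of raw forecast:       {rmse_nwp:.2f}")
print(f"RMSE of corrected forecast: {rmse_corrected:.2f}")
```

Because the physical model already explains most of the signal, the learned component only has to capture the remaining systematic error, which typically needs far less training data than learning the full forecast from scratch.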
Probabilistic NWP forecasts are primarily obtained by running several forecasts with marginally modified initial conditions, which is a time-consuming approach. Owing to the flexibility of DL techniques, a DL model can instead be trained to output both a deterministic forecast and a probability density without running several integrations of a deterministic climate or NWP model.
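One common way to obtain a probability density directly from a model is to have it predict the parameters of a distribution and train it with the negative log-likelihood (NLL). The sketch below assumes a Gaussian output (a mean and a standard deviation), which is an illustrative choice rather than something the studies above prescribe, and the numbers are hypothetical. The predicted mean doubles as the deterministic forecast.

```python
import math


def gaussian_nll(mean, std, observed):
    """Negative log-likelihood of `observed` under a normal distribution N(mean, std^2)."""
    variance = std ** 2
    return 0.5 * math.log(2 * math.pi * variance) + (observed - mean) ** 2 / (2 * variance)


def mean_nll(forecasts, observations):
    """Average NLL over (mean, std) forecast pairs and their matching observations."""
    total = sum(gaussian_nll(m, s, o) for (m, s), o in zip(forecasts, observations))
    return total / len(observations)


# Hypothetical observations and predicted means; every forecast is off by 2 degrees.
observations = [12.0, 18.0, 9.0, 21.0]
means = [14.0, 16.0, 11.0, 19.0]

scores = {}
for label, std in [("overconfident", 0.5), ("calibrated", 2.0), ("underconfident", 8.0)]:
    forecasts = [(m, std) for m in means]
    scores[label] = mean_nll(forecasts, observations)
    print(f"{label:>14}: mean NLL = {scores[label]:.3f}")
```

The loss is lowest when the predicted spread matches the actual error size, so minimizing it pushes a model toward honest uncertainty estimates from a single forward pass rather than from many perturbed runs.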
DL Applications
A convolutional long short-term memory (LSTM) network, ConvLSTM, with convolutional structures in both the state-to-state and input-to-state transitions, has been developed for precipitation nowcasting. ConvLSTM comprises an encoding network and a forecasting network. The encoding network encodes the spatio-temporal relationships in the meteorological data, improving forecasting accuracy.
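For reference, the ConvLSTM cell is usually written as follows, where $*$ denotes convolution and $\circ$ the element-wise (Hadamard) product. The symbols are the conventional ones from the ConvLSTM literature, not notation introduced in this article: $X_t$ is the input, $H_t$ the hidden state, $C_t$ the cell state, and $i_t$, $f_t$, $o_t$ the input, forget, and output gates.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f\right) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh\left(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o\right) \\
H_t &= o_t \circ \tanh\left(C_t\right)
\end{aligned}
```

The terms $W_{x\cdot} * X_t$ are the input-to-state transitions and the terms $W_{h\cdot} * H_{t-1}$ are the state-to-state transitions. Replacing the matrix products of a standard LSTM with convolutions keeps the states as spatial grids, so each location is updated from its local neighborhood.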
However, the ConvLSTM's convolutional recurrence structure is location-invariant. Thus, a trajectory gated recurrent unit (TrajGRU) model has been proposed to actively learn the location-variant structure of natural motion and transformation. The TrajGRU model uses the current input and the previous state to dynamically generate a local neighborhood set for every location at each time step, allowing it to capture the spatio-temporal correlations in meteorological data efficiently.
A predictive recurrent neural network (PredRNN), which incorporates additional connections between the adjacent time steps in a core stacked spatio-temporal LSTM (ST-LSTM), was proposed as a general forecasting framework. PredRNN uses a dual-memory mechanism to memorize and extract both temporal and spatial variations of the sequences simultaneously in a unified memory pool. PredRNN and its variants, like improved PredRNN (PredRNN++), are general frameworks that have been successfully extended to precipitation nowcasting.
Google Research has recently presented high-resolution DLWP models for precipitation nowcasting. One study treated forecasting as an image-to-image translation problem and leveraged the ubiquitous U-Net convolutional neural network (CNN). Another introduced MetNet, an improved neural weather model (NWM) that employs axial self-attention mechanisms to forecast precipitation rates. MetNet was the first DLWP model that outperformed NWP at a certain temporal and spatial scale.
The Toolkit for Extreme Climate Analysis (TECA) primarily performs large-scale detection of patterns in climate data using heuristic methods. Based on the TECA analysis output, a deep convolutional neural network (CNN) was applied to predict the class label for two extreme weather events, treating the problem as a binary classification task on cropped, centered patches from two-dimensional (2D) multi-channel images.
A three-dimensional (3D) multi-channel spatio-temporal convolutional encoder-decoder was presented in a study for multi-class localization and detection of extreme weather events such as tropical depressions, extra-tropical cyclones, and tropical cyclones. This study used a deep autoencoding architecture for bounding box regression with semi-supervised learning, in which the autoencoder is also trained on the reconstruction of unlabeled data. Thus, this approach helps address the challenge of labeling meteorological datasets. Similarly, a hybrid CNN-LSTM model was developed for both typhoon intensity and formation forecasting.
This hybrid model was designed to capture complex temporal and spatial features using three components: a CNN with a 3D filter that captures the spatial correlations between 3D atmospheric variables like air pressure and wind; an LSTM that captures the temporal correlations; and a 2D CNN that extracts features from the local neighborhood of the previous feature map and is used to analyze 2D sea surface variables like sea surface temperature (SST).
The hybrid CNN-LSTM model performed notably better than previous typhoon forecasting methods. In another study, based on an LSTM autoencoder, the weather forecasting problem was treated as an end-to-end DL problem. An effective information fusion mechanism was proposed to learn from historical data while incorporating prior knowledge from NWP to forecast several meteorological variables.
A recent study published in Pattern Analysis and Applications presented an innovative, lightweight, data-driven weather forecasting model. The study explored two temporal modeling approaches, temporal convolutional networks (TCN) and LSTM, and evaluated the model performance.
The proposed DL networks with TCN and LSTM layers were assessed in two regression settings: multi-input single-output (MISO) and multi-input multi-output (MIMO). Results demonstrated that the proposed DL LSTM model could produce more accurate and efficient weather forecasts up to 12 hours ahead than the well-established and complex Weather Research and Forecasting (WRF) NWP model.
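The difference between the multi-input single-output and multi-input multi-output settings lies mainly in how training samples are built from the time series. The following is a minimal sketch with a hypothetical helper, `make_windows`, and made-up variable names and values; it is not the preprocessing used in the study. Each sample pairs a window of past records containing all variables with the target values a fixed number of steps later.

```python
def make_windows(series, input_len, target_vars, horizon=1):
    """Pair each window of past records with target variables `horizon` steps later.

    series: list of dicts, one per time step, mapping variable name -> value.
    Returns a list of (inputs, targets) tuples, where inputs has one row per
    time step in the window (variables in alphabetical order).
    """
    samples = []
    for start in range(len(series) - input_len - horizon + 1):
        window = series[start:start + input_len]
        future = series[start + input_len + horizon - 1]
        inputs = [[step[name] for name in sorted(step)] for step in window]
        targets = [future[name] for name in target_vars]
        samples.append((inputs, targets))
    return samples


# Six hourly records with three hypothetical surface variables.
series = [{"temp": 10 + t, "pressure": 1000 - t, "wind": 2 * t} for t in range(6)]

# Multi-input single-output: all variables in, one variable out.
miso = make_windows(series, input_len=3, target_vars=["temp"])
# Multi-input multi-output: all variables in, all variables out.
mimo = make_windows(series, input_len=3, target_vars=["temp", "pressure", "wind"])

print("MISO target:", miso[0][1])
print("MIMO target:", mimo[0][1])
```

Both settings share the same inputs; only the width of the target changes, so a single-output model needs one network per predicted variable, while a multi-output model predicts them jointly.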
Recent Developments
A study published in Nature presented an AI-based method for accurate, medium-range global weather forecasting. The proposed system, Pangu-Weather, was trained on 39 years of global data and produced stronger deterministic forecast results than the operational Integrated Forecasting System (IFS) of the European Centre for Medium-Range Weather Forecasts (ECMWF), the world's best NWP system, on every tested weather variable when evaluated against reanalysis data.
Researchers demonstrated that 3D deep networks equipped with Earth-specific priors could effectively deal with complex patterns in weather data. Additionally, a hierarchical temporal aggregation strategy could reduce accumulation errors in medium-range forecasting. The proposed method also showed its effectiveness in ensemble forecasts and extreme weather forecasts. For instance, the method displayed higher accuracy in tracking tropical cyclones compared to ECMWF-HRES when Pangu-Weather was initialized with reanalysis data.
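The idea behind hierarchical temporal aggregation can be sketched as a greedy planning step: with several models trained for different lead times, a target lead time is reached with as few model calls as possible, so fewer errors are chained together. The function name `plan_rollout` is hypothetical, and the set of lead times (1, 3, 6, and 24 hours) is an assumption made here for illustration.

```python
def plan_rollout(lead_time_hours, model_steps=(24, 6, 3, 1)):
    """Greedily cover the lead time, always using the longest-step model that fits."""
    plan = []
    remaining = lead_time_hours
    for step in sorted(model_steps, reverse=True):
        calls, remaining = divmod(remaining, step)
        plan.extend([step] * calls)
    if remaining:
        raise ValueError("lead time cannot be reached with the given model steps")
    return plan


hierarchical = plan_rollout(56)
hourly_only = plan_rollout(56, model_steps=(1,))
print(f"hierarchical plan: {hierarchical} ({len(hierarchical)} model calls)")
print(f"1-hour model only: {len(hourly_only)} model calls")
```

Under these assumptions, a 56-hour forecast takes five model calls instead of 56, so each forecast passes through far fewer iterations in which errors can accumulate.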
Overall, DL-based approaches can increase both forecasting efficiency and accuracy and address the shortcomings of conventional weather prediction methods. However, challenges like effective integration with NWP, generalizability, data availability, and interpretability must be addressed for wider adoption of these approaches.
References and Further Reading
Bi, K., Xie, L., Zhang, H., Chen, X., Gu, X., Tian, Q. (2023). Accurate medium-range global weather forecasting with 3D neural networks. Nature, 619(7970), 533-538. https://doi.org/10.1038/s41586-023-06185-3
Hewage, P., Trovati, M., Pereira, E., Behera, A. (2021). Deep learning-based effective fine-grained weather forecasting model. Pattern Analysis and Applications, 24(1), 343-366. https://doi.org/10.1007/s10044-020-00898-1
Ren, X., Li, X., Ren, K., Song, J., Xu, Z., Deng, K., Wang, X. (2021). Deep Learning-Based Weather Prediction: A Survey. Big Data Research, 23, 100178. https://doi.org/10.1016/j.bdr.2020.100178
Dewitte, S., Cornelis, J. P., Müller, R., Munteanu, A. (2021). Artificial Intelligence Revolutionises Weather Forecast, Climate Monitoring and Decadal Prediction. Remote Sensing, 13(16), 3209. https://doi.org/10.3390/rs13163209