In a paper published in the journal Computers and Electronics in Agriculture, researchers set out to accurately estimate crop evapotranspiration (ETc) for winter wheat with several interpretable and non-interpretable machine learning (ML) models, trained on ETc data observed from 2007 to 2013. The models, including random forest, extreme gradient boosting, support vector machine, and deep neural network, were optimized using the particle swarm optimization (PSO) algorithm. The research also aims to address the challenges in optimizing water management practices.
Background
Water scarcity and climate change pose widely recognized challenges to agricultural development and food security. The issue is particularly significant in the North China Plain, a critical contributor to China's wheat production and the world's largest winter wheat producer. The region's monsoon climate shapes its yearly rainfall, which supplies only 20–30% of the water needed during the winter wheat growth period. Irrigation is therefore vital for addressing water shortages and maintaining crop yields. Yet this practice strains groundwater reserves, leading to ecological imbalances and hindering sustainable agricultural development. Creating efficient irrigation systems and improving water use in winter wheat cultivation is thus crucial to alleviating water stress in the North China Plain.
Accurately estimating ETc is paramount in this context, serving as a foundation for rational water management, conservation, and enhanced productivity. ETc covers the total water lost through soil evaporation between plants and through plant transpiration, and it strongly influences crop water requirements. Precisely estimating winter wheat ETc in the North China Plain is vital for mitigating water stress, improving water use efficiency, and promoting sustainable water resource management. Several techniques have been used to monitor ETc, including the aerodynamic method, the lysimeter method, the eddy covariance method, the water balance method, and remote sensing. However, these methods are often costly, time-consuming, and labor-intensive, which makes direct ETc measurement challenging.
Proposed method
The crop coefficient (Kc) is the ratio of ETc to the reference evapotranspiration (ET0). It captures the crop-specific attributes that determine evapotranspiration demand, such as crop height, crop-soil surface resistance, and crop-soil surface albedo. Following the single crop coefficient method in FAO-56, the growth period of winter wheat was divided into four standard stages: initial, crop development, mid-season, and late season. These growth stages were determined for winter wheat in 2007–2013.
The PSO algorithm was used to optimize the interpretable and non-interpretable models for estimating the daily ETc of winter wheat under varying combinations of meteorological inputs and Kc. PSO is a population-based search method: candidate solutions (particles) move through the search space, each guided by its own best position so far and by the best position found by the whole swarm, which helps the search converge on good solutions. Modern ML models have succeeded in various domains, but their opacity raises interpretability concerns. Interpretability, meaning the ability to understand how inputs map to outputs, is crucial for transparency. This study developed interpretable ML models based on PSO-optimized random forest (PSO-RF) and extreme gradient boosting regression (PSO-XGBR) for accurate winter wheat ETc estimation.
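The particle update rule described above can be sketched in a few lines. This is a minimal, generic global-best PSO in pure Python, not the authors' implementation: the swarm size, inertia weight, acceleration constants, and the `toy_validation_error` objective are all illustrative assumptions. In a real workflow the objective would be a model's validation error as a function of its tunable settings.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO.

    All default settings here are illustrative assumptions.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares one global best.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Move the particle and clamp it to the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_validation_error(params):
    """Hypothetical stand-in for a model's validation error (minimum at (3, -1))."""
    a, b = params
    return (a - 3.0) ** 2 + (b + 1.0) ** 2


best_params, best_error = pso_minimize(toy_validation_error,
                                       [(-10.0, 10.0), (-10.0, 10.0)])
```

Replacing `toy_validation_error` with a function that trains a model for the given settings and returns its validation error turns this into a simple tuner; each evaluation then costs one model fit, so the swarm size and iteration count set the compute budget.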
The models were evaluated with five performance metrics: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Nash-Sutcliffe Efficiency (NSE), the coefficient of determination (R2), and the Global Performance Indicator (GPI). Notably, models based on crop coefficient (Kc) and net radiation (Rn) inputs proved accurate even with limited meteorological data. Among the models, the Particle Swarm Optimized Support Vector Machine (PSO-SVM) showed the best ETc estimation results. The study also employed local interpretable model-agnostic explanations (LIME) to provide insights into hydrological and climatic processes, and it identified the inflection points of the daily climatic parameters related to ETc.
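For reference, the first four metrics can be computed as in the sketch below. It assumes that R2 is the squared Pearson correlation between observed and estimated values, which is one common convention but may differ from the paper's exact definition. GPI is omitted because it combines normalized scores across several models. The sample values are made up for illustration.

```python
import math


def mae(obs, pred):
    """Mean Absolute Error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root Mean Square Error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe Efficiency: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / sst


def r2(obs, pred):
    """Squared Pearson correlation (assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    mean_pred = sum(pred) / len(pred)
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)


# Made-up daily ETc values (mm/day), for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
estimated = [1.5, 2.5, 2.5, 4.5]
scores = {
    "MAE": mae(observed, estimated),
    "RMSE": rmse(observed, estimated),
    "NSE": nse(observed, estimated),
    "R2": r2(observed, estimated),
}
```

MAE and RMSE are in the units of ETc, so lower is better, while NSE and R2 are dimensionless and approach 1 for a good fit.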
Experimental results
From 2007 to 2013, researchers collected daily flux data from irrigated farmland ecosystems in the North China Plain to analyze winter wheat growth. They aimed to determine the Kc values that influence the ETc of winter wheat across different growth stages. The daily actual Kc values were derived by dividing observed ETc by the reference evapotranspiration (ET0), and the Kc values for the different growth stages were adjusted using the least squares method based on the single crop coefficient approach.
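The two steps above can be illustrated with a short sketch. It assumes a simplification that the summary does not spell out: within one growth stage, a single constant Kc is fitted by minimizing the squared error of ETc = Kc × ET0, which has the closed-form solution used below. The function names and sample values are hypothetical.

```python
def daily_kc(etc, et0):
    """Daily actual Kc as observed ETc divided by reference ET0.

    Assumes every ET0 value is positive.
    """
    return [e / r for e, r in zip(etc, et0)]


def stage_kc_least_squares(etc, et0):
    """Constant Kc for one growth stage, fitted by least squares.

    Minimizing sum((ETc - Kc * ET0)^2) over Kc gives
    Kc = sum(ETc * ET0) / sum(ET0^2).
    """
    numerator = sum(e * r for e, r in zip(etc, et0))
    denominator = sum(r * r for r in et0)
    return numerator / denominator


# Made-up mid-season values (mm/day), for illustration only.
et0_mid = [2.0, 3.0, 4.0]
etc_mid = [2.0, 3.6, 4.2]

kc_daily = daily_kc(etc_mid, et0_mid)
kc_stage = stage_kc_least_squares(etc_mid, et0_mid)
```

The least squares estimate weights days with high ET0 more heavily than a plain average of the daily ratios would, so the two summaries generally differ slightly.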
Kc values for the initial and late-season stages of winter wheat growth were lower than the recommended values, while Kc for the mid-season stage was similar to the recommended value. The study also examined correlations among meteorological data (temperature, radiation, humidity, wind speed, etc.), Kc values, and the ETc of winter wheat. Certain meteorological factors, including net radiation (Rn) and temperature (Tmax, Tmin), correlated more strongly with ETc than the others did.
Interpretable and non-interpretable ML models were used to estimate winter wheat ETc from different input combinations of meteorological data and Kc values. Two of them, the PSO-SVM and PSO-RF models, provided reliable ETc estimates from minimal meteorological inputs, and their performance improved further when additional meteorological variables were included. LIME analysis was used to clarify the behavior of the ML models and to provide interpretable explanations for their predictions. This analysis highlighted the importance of certain meteorological variables, such as temperature and net radiation, in influencing the estimated ETc values.
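The idea behind LIME can be shown with a simplified local surrogate. This is a hand-written sketch, not the `lime` library and not the authors' setup: it perturbs one input row, weights the perturbed samples by their closeness to that row, and fits a weighted linear model whose coefficients act as local feature effects. The toy model, feature scales, and kernel width are assumptions chosen for illustration.

```python
import math
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def local_linear_explanation(predict, instance, scales, n_samples=500,
                             kernel_width=1.0, seed=0):
    """Fit a weighted linear surrogate to `predict` around `instance`.

    Returns (intercept, effects); each effect is the local change in the
    prediction per one `scale` of change in that feature.
    """
    rng = random.Random(seed)
    size = len(instance) + 1
    xtwx = [[0.0] * size for _ in range(size)]
    xtwy = [0.0] * size
    for _ in range(n_samples):
        # Perturb each feature around the instance, in units of its scale.
        z = [rng.gauss(0.0, 1.0) for _ in instance]
        sample = [x + s * zi for x, s, zi in zip(instance, scales, z)]
        # Samples closer to the instance receive larger weights.
        weight = math.exp(-sum(zi * zi for zi in z) / kernel_width ** 2)
        row = [1.0] + z
        y = predict(sample)
        # Accumulate the weighted normal equations.
        for i in range(size):
            xtwy[i] += weight * row[i] * y
            for j in range(size):
                xtwx[i][j] += weight * row[i] * row[j]
    coefs = solve_linear_system(xtwx, xtwy)
    return coefs[0], coefs[1:]


def toy_etc_model(x):
    """Hypothetical black-box ETc model with inputs [Rn, Tmax]."""
    rn, tmax = x
    return 0.3 * rn + 0.1 * tmax + 0.01 * rn * tmax


intercept, effects = local_linear_explanation(
    toy_etc_model, instance=[10.0, 20.0], scales=[2.0, 5.0])
```

Positive effects indicate that raising the feature near this particular day raises the estimated ETc. The published LIME method adds steps that this sketch leaves out, such as feature selection and a regularized surrogate.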
Conclusion
In summary, this study developed interpretable and non-interpretable ML models, optimized with the PSO algorithm, to accurately estimate winter wheat ETc from limited meteorological data. The recommended winter wheat ETc estimation models were the PSO-RF model (interpretable) and the PSO-SVM model (non-interpretable). The ninth input combination, which included Kc, Rn, Tmax, Tmin, n (sunshine duration), and U2 (wind speed at 2 m height), significantly improved model performance, with the PSO-SVM model showing the best results. The LIME analysis led to the hypothesis that lower Tmin, Tmax, Rn, and n values correspond to lower winter wheat ETc, which was found to be consistent with real-world hydroclimatic processes. The study thus contributes robust models for winter wheat ETc estimation whose interpretability is enhanced by the LIME method. Future research should focus on refining input data, exploring additional interpretability techniques, and broadening the scope to benefit agricultural water resource management and crop production sustainability.