Predicting soil physical properties such as clay, sand, and silt content is vital for agricultural and environmental purposes. In a recent publication in the journal Sustainability, researchers proposed a method for predicting these properties that combines geospatial artificial intelligence (GeoAI) with satellite imagery fusion.
Background
Soil plays a vital role in climate and ecosystem regulation and is fundamental to producing 97 percent of the world's food. It also profoundly affects agricultural productivity, environmental protection, and wildlife conservation. Soil texture, defined by the relative proportions of sand, silt, and clay, is critical for erosion control, water management, and productivity. Soil faces significant challenges, including erosion from rainfall at various scales, which can modify its properties, especially its texture. Hence, spatial prediction of soil properties is imperative for assessing the impact of human use on soil quality.
Remote sensing (RS) data, which is globally accessible and abundant, greatly benefits agriculture. Advances in RS technology have improved large-scale data processing. Combining Geographic Information Systems (GIS) with RS improves the efficiency of data collection, analysis, and modeling. GIS tools facilitate the integration of spatial information and environmental parameters, aiding spatial prediction.
Soil property forecasting
To address these challenges, the study aimed to predict soil physical properties at a spatial resolution of 30 m by 30 m. The soil dataset comprises 317 samples collected from depths of 0–30 cm by the Iran Water and Soil Research Institute. The samples were characterized for clay, sand, and silt content and were collected using a grid sampling approach with GPS coordinates. They span various land cover categories, with 73 percent from agricultural land, 13 percent from rangeland, and the rest from other land types. Many samples are situated below 200 m in altitude.
Soil texture was analyzed with the hydrometer method, and outliers were then removed, which reduced the size of the dataset. The environmental parameters considered as influences on soil texture comprised RS, climate, and topographic variables. The topographic data came from the Shuttle Radar Topography Mission (SRTM) digital terrain model, and the RS parameters came from the Landsat-8 satellite. Climate information was gathered from meteorological stations.
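As an illustration of how an RS variable can be derived from Landsat-8 bands, the sketch below computes the Normalized Difference Vegetation Index (NDVI) from near-infrared (Band 5) and red (Band 4) reflectance. The function name and the reflectance values are hypothetical, and the study's actual preprocessing chain is not described here.

```python
def ndvi(nir: float, red: float) -> float:
    """Return NDVI = (NIR - Red) / (NIR + Red); 0.0 if both bands are zero."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero on empty or masked pixels
    return (nir - red) / total


# Hypothetical surface-reflectance values (Landsat-8 Band 5 = NIR, Band 4 = red).
pixels = [
    {"name": "dense crop", "nir": 0.45, "red": 0.05},
    {"name": "bare soil", "nir": 0.25, "red": 0.20},
]

for p in pixels:
    print(p["name"], round(ndvi(p["nir"], p["red"]), 3))
```

Values near 1 indicate dense green vegetation, while values near 0 are typical of bare soil.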
The researchers compared three models for predicting soil texture: a Convolutional Neural Network (CNN), the Random Forest (RF) algorithm, and a hybrid CNN-RF model. RF combines many decision trees to improve accuracy, while the CNN extracts features from the input data. The CNN-RF hybrid leverages both approaches for improved performance.
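The general idea of a CNN-RF hybrid can be sketched as follows: a small CNN is trained on the covariates, its learned features are extracted, and an RF regressor is fitted on those features. This is a minimal sketch assuming PyTorch and scikit-learn; the architecture, hyperparameters, and synthetic data are all assumptions and do not reproduce the configuration used in the study.

```python
import numpy as np
import torch
from torch import nn
from sklearn.ensemble import RandomForestRegressor

torch.manual_seed(0)
rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): 200 samples, 12 covariates.
n_samples, n_covariates = 200, 12
X = rng.normal(size=(n_samples, n_covariates)).astype(np.float32)
y = (2.0 * X[:, 0] - X[:, 3] + rng.normal(scale=0.1, size=n_samples)).astype(np.float32)
X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]


class FeatureCNN(nn.Module):
    """Hypothetical 1D CNN: convolve over the covariate vector, then regress."""

    def __init__(self, n_inputs: int, n_features: int = 16):
        super().__init__()
        self.extract = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(8 * n_inputs, n_features),
            nn.ReLU(),
        )
        self.head = nn.Linear(n_features, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.extract(x)).squeeze(-1)


def to_tensor(a: np.ndarray) -> torch.Tensor:
    # Add a channel dimension: (N, covariates) -> (N, 1, covariates).
    return torch.from_numpy(a).unsqueeze(1)


# Step 1: train the CNN end to end on the training split only.
model = FeatureCNN(n_inputs=n_covariates)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(100):
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(to_tensor(X_train)), torch.from_numpy(y_train))
    loss.backward()
    optimizer.step()

# Step 2: use the trained convolutional part as a feature extractor.
model.eval()
with torch.no_grad():
    feats_train = model.extract(to_tensor(X_train)).numpy()
    feats_test = model.extract(to_tensor(X_test)).numpy()

# Step 3: fit the Random Forest on the CNN features and predict.
rf = RandomForestRegressor(n_estimators=100, random_state=0)
rf.fit(feats_train, y_train)
pred = rf.predict(feats_test)
print("test MSE:", float(np.mean((pred - y_test) ** 2)))
```

In this arrangement the RF replaces the CNN's final linear layer as the predictor, which is one common way to combine the two; other couplings are possible.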
Results and analysis
The correlation coefficients between the Multi-Resolution Index of Valley Bottom Flatness (MRVBF) and clay (0.2) and between the RS variable Band 7 (B7) and clay (-0.26) indicate stronger relationships than those of the other parameters, although both are modest in absolute terms. Conversely, the associations between clay and the Redness Index (RI), as well as between clay and aspect, were weak. For sand, the correlation coefficients of -0.18 with the environmental variable land surface temperature (LST) and 0.12 with the RS variable Band 5 (B5) indicate stronger associations than those of the other parameters.
MRVBF and the Coloration Index (CI) exhibited the weakest correlations with sand. Silt correlated strongly and positively with elevation, and relatively strongly and negatively with the Normalized Difference Vegetation Index (NDVI). However, the associations between silt and B5 and between silt and the environmental variable Redness Index (RI) were weak.
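Correlation coefficients of this kind can be computed as in the sketch below, which assumes the Pearson coefficient (the article does not state which coefficient the study used). The covariate and clay values are made up for illustration only.

```python
import math


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical values: a Band 7 reflectance and clay content (%) at six sites.
b7 = [0.21, 0.18, 0.25, 0.30, 0.16, 0.27]
clay = [34.0, 38.0, 31.0, 33.0, 36.0, 29.0]

print(f"r(B7, clay) = {pearson_r(b7, clay):.2f}")
```

The sign gives the direction of the relationship and the magnitude its strength, which is how values such as -0.26 and 0.2 are read above.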
The feature importance analysis conducted using the RF algorithm revealed that B7, CI, and the Topographic Wetness Index (TWI) had higher importance for silt content. For sand, LST, B5, and elevation exhibited relatively higher importance, while MRVBF, B7, and TWI were significant factors influencing clay content. Clay and sand were mainly influenced by RS, topographic, and climatic parameters, with RS parameters having a more significant impact. Climatic parameters, however, played no role in determining silt content. Rainfall among the climatic parameters, and NDVI and B7 among the RS parameters, significantly affected soil texture. Among topographic parameters, TWI, MRVBF, and the Multi-Resolution Ridgetop Flatness Index (MRRTF) were the most influential.
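A feature importance ranking of this kind can be obtained from a fitted Random Forest, as sketched below with scikit-learn's impurity-based `feature_importances_` attribute. The covariate names are borrowed from the article only as labels; the data are synthetic, so the ranking shown is an artifact of how the data were generated and does not reproduce the study's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Column labels are illustrative; the values are random, not real covariates.
names = ["B7", "NDVI", "TWI", "MRVBF"]
X = rng.normal(size=(300, len(names)))

# Synthetic target driven mostly by the first column, weakly by the third.
y = 3.0 * X[:, 0] + 0.5 * X[:, 2] + rng.normal(scale=0.2, size=300)

rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances sum to 1 across all features.
ranking = sorted(zip(names, rf.feature_importances_), key=lambda t: t[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Impurity-based importances can favor high-cardinality or correlated features, so a permutation-based check is a reasonable complement in practice.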
The results showed that the CNN-RF model outperformed the other models in both the training and testing stages for all soil texture properties. According to the mean squared error (MSE), the CNN model was more accurate than the RF model across the sand, silt, and clay properties. The prediction error plots and Taylor diagrams confirmed the superior accuracy of the CNN-RF model.
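For reference, the MSE is the mean of the squared differences between observed and predicted values, and the model with the lowest MSE is considered the most accurate. The sketch below applies this to two placeholder models; the model names and numbers are hypothetical and are not results from the study.

```python
def mse(y_true: list[float], y_pred: list[float]) -> float:
    """Mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed clay content (%) and predictions from two models.
observed = [30.0, 42.0, 25.0, 38.0]
predictions = {
    "model_a": [33.0, 40.0, 28.0, 35.0],
    "model_b": [31.0, 41.0, 26.0, 37.0],
}

scores = {name: mse(observed, pred) for name, pred in predictions.items()}
best = min(scores, key=scores.get)
print(scores, "-> lowest MSE:", best)
```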
Finally, prediction maps of the soil texture properties were generated for the entire study area, showing that the CNN-RF model produced predictions closer to the observed values than the stand-alone models did. The prediction maps of the RF and CNN models were more similar to each other. External validation with soil samples withheld from training further confirmed the accuracy of the models, with the CNN-RF model consistently performing best.
Conclusion
In summary, the study compared the CNN, RF, and CNN-RF algorithms for soil texture prediction using satellite images. RF identified MRVBF, LST, and B7 as crucial for clay, sand, and silt, respectively. Remote sensing variables, especially NDVI, B7, SI, B5, CI, RI, and CLI, had the most influence. The hybrid CNN-RF model showed the highest accuracy. The resulting prediction maps can support agriculture, erosion monitoring, and irrigation. Future research could explore meta-heuristic algorithms, gray-level co-occurrence matrix variables, and additional machine-learning models for improved accuracy.