In a paper published in the journal Scientific Reports, researchers delved into the complexity of predicting compressive strength (CS) in preplaced aggregate concrete (PAC), also known as two-stage concrete (TSC). They evaluated 13 machine learning (ML) models using 261 data points and 11 input variables.
While other models, such as gradient boosting and categorical boosting (CatBoost), also performed well, extreme gradient boosting (XGBoost) was the most accurate. Sensitivity analysis highlighted the significant impact of input parameters, with gravel, sand, cement, and additives playing major roles. Using the XGBoost model, the researchers performed a Shapley Additive Explanations (SHAP) analysis, which identified the water-to-binder ratio, superplasticizer, and gravel as key CS factors.
Furthermore, researchers created a graphical user interface (GUI) for real-world use, simplifying prediction methods for civil engineering projects. This study provided valuable insights for researchers and practitioners, enhancing predictive models' reliability in PAC strength prediction.
Related Work
Previous work has extensively explored the complexities of traditional concrete, while PAC has received comparatively less attention. Recognizing this gap, researchers have turned to ML techniques to predict concrete properties, reflecting a growing trend of integrating artificial intelligence (AI) into structural engineering. A variety of ML techniques have been used to predict specific properties, such as multi-expression programming (MEP), random forest (RF), gene expression programming (GEP), and artificial neural networks (ANN).
Methodology and Analysis
The study first divided the dataset into training, testing, and validation sets to ensure robust model evaluation. The researchers applied a range of ML techniques, including linear regression, support vector machine (SVM), and XGBoost. They compared the models' performance and conducted an external validation analysis to assess their effectiveness.
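A three-way split of this kind can be sketched in a few lines of Python. The 70/15/15 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration; the summary above does not state the ratios the authors used.

```python
import random

def split_dataset(rows, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle rows and split them into training, validation, and test lists.

    The fractions are illustrative defaults, not values from the paper.
    """
    if train_frac <= 0 or val_frac < 0 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave a non-empty test share")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test

# 261 placeholder sample indices, matching the dataset size reported above.
samples = list(range(261))
train, val, test = split_dataset(samples)
print(len(train), len(val), len(test))
```

Shuffling before slicing avoids any ordering bias in the source data, and keeping the test portion untouched until the end gives an honest estimate of performance on unseen mixes.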
The technique with the best predictive performance was selected as the optimal approach for the study. The dataset comprised 261 samples collected from published literature, covering eleven key parameters that influence the compressive strength of PAC. The researchers conducted a descriptive analysis of these input variables, examining mean, median, and skewness to understand their distribution and variability. Pearson's correlation analysis revealed relationships between variables, clarifying their effects on compressive strength.
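As a rough illustration of these descriptive measures, the sketch below computes mean, median, population skewness, and Pearson's correlation using only the Python standard library. The numbers are made up for demonstration and are not taken from the study's dataset; the variable names `w_b` and `cs` are likewise hypothetical.

```python
import math
import statistics

def skewness(values):
    """Population skewness: third central moment divided by std**3."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Invented values: water-to-binder ratio and compressive strength (MPa).
w_b = [0.40, 0.45, 0.50, 0.55, 0.60]
cs = [42.0, 38.5, 35.0, 31.0, 28.0]

print("mean:", statistics.mean(cs))
print("median:", statistics.median(cs))
print("skewness:", skewness(cs))
print("Pearson r (w/b vs CS):", pearson_r(w_b, cs))
```

In this toy data the correlation is strongly negative, which mirrors the general expectation that a higher water-to-binder ratio lowers strength; the actual coefficients for the 261 samples are reported in the paper.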
Researchers tuned each model's parameters and hyperparameters to get the best performance. They used 13 ML models, each configured with hyperparameter values identified through systematic trial and error. This careful hyperparameter selection was crucial for accurate predictions.
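The idea of systematic trial and error can be illustrated with a deliberately tiny stand-in: a one-parameter ridge regression whose penalty `lam` is chosen by trying each candidate and keeping the one with the lowest validation error. The model, data, and candidate values are all assumptions for illustration; the study tuned far richer models such as XGBoost, and its actual search settings are not given in this summary.

```python
import math

def fit_ridge_1d(xs, ys, lam):
    """Closed-form slope for y ~ w * x with an L2 penalty lam on w."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def search(train, val, candidates):
    """Try every candidate penalty and return (best validation RMSE, best lam)."""
    results = []
    for lam in candidates:
        w = fit_ridge_1d(train[0], train[1], lam)
        score = rmse(val[1], [w * x for x in val[0]])
        results.append((score, lam))
    return min(results)  # lowest validation error wins

# Invented data: the training targets are slightly inflated relative to validation.
train = ([1.0, 2.0, 3.0, 4.0], [2.5, 4.4, 6.9, 8.6])
val = ([5.0, 6.0], [10.0, 12.1])

best_score, best_lam = search(train, val, [0.0, 1.0, 3.0, 10.0])
print("best lam:", best_lam, "validation RMSE:", round(best_score, 3))
```

Scoring each candidate on held-out validation data, rather than on the training data, is what keeps this kind of search from simply rewarding the most flexible setting.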
Several regression error metrics were used to evaluate the models' effectiveness, including the correlation coefficient, mean absolute error, and root mean square error. These metrics quantified the predictive models' reliability and accuracy. Additionally, external validation was conducted by comparing the models' performance against established criteria and prior research findings, confirming their effectiveness.
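The three metrics named above follow standard definitions and can be written out directly. The sketch below is a minimal standard-library version; the measured and predicted strengths are invented values, not results from the paper.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference; penalizes large misses."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = math.sqrt(sum((t - mt) ** 2 for t in y_true))
    sp = math.sqrt(sum((p - mp) ** 2 for p in y_pred))
    return cov / (st * sp)

# Invented compressive strengths in MPa.
measured = [30.0, 40.0, 50.0]
predicted = [32.0, 38.0, 50.0]

print("MAE :", mean_absolute_error(measured, predicted))
print("RMSE:", root_mean_square_error(measured, predicted))
print("R   :", correlation_coefficient(measured, predicted))
```

MAE and RMSE are expressed in the same units as the target (here MPa), whereas the correlation coefficient is unitless and only measures how well predictions track the measured trend.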
ML Model Analysis
The results analyze the performance of the 13 ML models in predicting PAC's compressive strength. Regression plots and error assessments were used to evaluate the models' predictive capabilities, revealing XGBoost as the most accurate model, closely followed by gradient boosting and CatBoost. External validation further supported these findings, emphasizing the importance of selecting a model based on specific research requirements.
Additionally, sensitivity analysis quantified the relative contribution of each input parameter to the model output, with gravel and the water-to-binder ratio having significant impacts. The SHAP analysis showed the relative importance of the input features for compressive strength, with variables such as gravel and water exerting substantial influence.
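SHAP builds on Shapley values, which attribute a prediction to individual features by averaging each feature's marginal contribution over all orders in which features could be added. The sketch below computes exact Shapley values by brute force for a three-feature toy model. The formula in `toy_strength`, the feature names, and the baseline and instance values are all hypothetical; the study applied SHAP to its trained XGBoost model, not to a hand-written formula.

```python
from itertools import permutations

def shapley_values(model, instance, baseline):
    """Exact Shapley values relative to a baseline, via all feature orderings."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)  # start from the reference mix
        previous = model(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its actual value
            updated = model(current)
            totals[name] += updated - previous  # marginal contribution
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_strength(mix):
    """Hypothetical strength formula in MPa, for illustration only."""
    return (60.0 - 50.0 * mix["w_b"] + 4.0 * mix["sp"]
            + 0.005 * mix["gravel"] - 2.0 * mix["sp"] * mix["w_b"])

baseline = {"w_b": 0.5, "sp": 1.0, "gravel": 1500.0}
instance = {"w_b": 0.4, "sp": 2.0, "gravel": 1600.0}

phi = shapley_values(toy_strength, instance, baseline)
for name, value in phi.items():
    print(name, round(value, 4))
```

The contributions always sum to the difference between the instance prediction and the baseline prediction, which is the property that makes such attributions easy to read. Brute force over orderings is only feasible for a handful of features; practical SHAP implementations use faster algorithms for tree ensembles.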
Furthermore, the GUI provided a user-friendly tool for predicting compressive strength, catering to both academic research and industrial applications. Overall, the study's findings have practical applications ranging from optimizing concrete mixtures to enhancing quality control in construction, with potential benefits for concrete production, construction practices, and structural design.
The regression plot comparison showed how performance varied across the ML models, with XGBoost achieving the highest R² value, indicating superior predictive accuracy. Other models, such as CatBoost, gradient boosting, and the voting regressor, were competitive, reflecting their ability to capture complex data patterns.
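For reference, R² (the coefficient of determination) compares a model's squared errors with those of a baseline that always predicts the mean. The sketch below uses the standard definition on invented numbers; the two prediction lists are hypothetical and do not correspond to any model in the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Invented compressive strengths in MPa.
measured = [30.0, 40.0, 50.0]
good_fit = [31.0, 39.0, 50.0]  # close to the measurements
weak_fit = [36.0, 40.0, 44.0]  # compressed toward the mean

print("good fit R2:", r_squared(measured, good_fit))
print("weak fit R2:", r_squared(measured, weak_fit))
```

A value near 1 means the model explains most of the variance in the measured strengths, while a value near 0 means it does little better than predicting the average.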
Conversely, models such as Lasso and Ridge regression showed relatively weaker fits, indicating limited explanatory power in this specific context. The error assessment provided further insight into model performance, with CatBoost, decision tree (DT), and linear regression showing relatively lower mean absolute errors, indicating higher accuracy.
Models like ANN and SVM exhibited moderate predictive abilities, while others like AdaBoost and RF demonstrated satisfactory performance. These findings collectively underscored the strengths and weaknesses of each model, guiding informed decisions regarding their practical application in predicting the compressive strength of preplaced aggregate concrete.
Conclusion
To summarize, this study evaluated 13 ML models for predicting the compressive strength of preplaced aggregate concrete. The researchers identified key factors influencing strength, and XGBoost emerged as the most accurate predictor. Other models, such as gradient boosting and CatBoost, also showed strong performance.
External validation results confirmed the accuracy of selected models. Sensitivity analysis highlighted significant contributors, while SHAP analysis provided insights into key influential factors. This research offers valuable guidance for selecting appropriate models and optimizing concrete strength prediction.
Furthermore, the findings contribute to advancing the field of concrete technology by providing practical insights into optimizing mixtures and enhancing quality control in construction practices.
Journal reference:
- Javed, M. F., Fawad, M., Lodhi, R., Najeh, T., & Gamil, Y. (2024). Forecasting the strength of preplaced aggregate concrete using interpretable machine learning approaches. Scientific Reports, 14(1), 8381. https://doi.org/10.1038/s41598-024-57896-0, https://www.nature.com/articles/s41598-024-57896-0