In a recent publication in the journal Scientific Reports, researchers explored the significance of the biomass higher heating value (HHV) in gauging the energy potential of agricultural byproducts. The study builds a robust model for HHV estimation by combining feature selection with machine learning (ML).
Background
In recent years, renewable energy sources such as biomass, solar radiation, hydropower, geothermal, and tidal energy have gained prominence in various global sectors due to their accessibility, cost-effectiveness, and efficiency. Among these sources, biomass has drawn particular interest. It undergoes various mechanical and chemical processes before being converted into energy. Numerous studies have explored different aspects of biomass conversion, including combustion and pyrolysis specifications, the thermodynamics of biomass gasification-based syngas, and efficient biofuel production.
HHV plays a pivotal role in designing and operating biomass-based energy systems. While HHV measurement through adiabatic oxygen bomb calorimetry is accurate, it is time-consuming and costly. Estimating HHV from proximate analysis offers an efficient and cost-effective alternative.
ML models for HHV
Recently, ML tools have made significant strides in diverse academic and industrial domains, including renewable energy and environmental preservation. Several ML techniques are used in biomass energy research, including artificial neural networks (ANNs), adaptive neuro-fuzzy inference systems (ANFIS), random forest (RF), support vector regression (SVR), and the group method of data handling (GMDH). Researchers have used ML to study biomass-to-energy conversion, predict the heating value of different biomass materials, relate the heating value of waste materials to associated factors, and examine how moisture affects biomass characteristics.
The current study uses feature selection techniques to systematically identify the biomass features that most strongly govern HHV. The selected features then serve as the independent variables for HHV estimation with five ML tools. A comparison of the models identifies the most accurate tool, which is then validated against a recent model from the literature. The authors present the study as the most comprehensive in the field to date. It ranks the proximate and ultimate compositional analyses according to their significance for HHV, and it addresses a research gap by combining feature selection scenarios with ML methods while offering higher accuracy than recent models.
Results and analysis
To construct and evaluate a robust predictive model for HHV, an extensive experimental database is essential. Researchers compiled a database consisting of 532 HHV records and corresponding proximate (fixed carbon, ash, and volatile matter) and ultimate (carbon, hydrogen, sulfur, nitrogen, and oxygen) compositional analyses. The data undergoes several steps: feature selection, ML model design, model comparison, and evaluation.
Feature Selection: The study uses two feature selection methods, multiple linear regression (MLR) and the Pearson correlation coefficient, to rank the compositional features of the biomass samples by their effect on the observed HHV. The features considered are fixed carbon, volatile matter, ash, carbon, nitrogen, oxygen, sulfur, and hydrogen content.
MLR is employed to establish a linear relationship between HHV and its influential variables. Coefficients from the MLR model indicate the strength and direction of influence. Carbon, ash, fixed carbon, sulfur, and hydrogen content are identified as the most important features affecting HHV. Pearson's correlation coefficient measures the strength and direction of the relationship between HHV and each influential variable. It confirms that carbon, ash, fixed carbon, hydrogen, and sulfur content are the most important features.
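The Pearson-based ranking described above can be illustrated with a minimal sketch. It is not the authors' code: the function names `pearson` and `rank_features` are hypothetical, the sample values are made up for illustration and are not taken from the study's database, and ranking by the absolute value of the coefficient is an assumption about how "importance" is ordered.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rank_features(features, target):
    """Return (name, r) pairs sorted by |r| with the target, strongest first."""
    scores = {name: pearson(values, target) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Made-up illustrative values (wt% for composition, MJ/kg for HHV).
# These are NOT records from the study's 532-sample database.
samples = {
    "carbon": [45.0, 48.2, 50.1, 42.3, 52.0],
    "ash": [6.0, 4.1, 2.5, 9.8, 1.9],
    "nitrogen": [0.9, 0.5, 1.2, 0.7, 0.6],
}
hhv = [17.8, 18.9, 19.8, 16.5, 20.6]

ranking = rank_features(samples, hhv)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

The sign of each coefficient gives the direction of the relationship and its magnitude gives the strength, which is why the sketch sorts on the absolute value. An MLR-based ranking would instead compare the fitted regression coefficients, ideally on standardized inputs so that the coefficients are comparable.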
ML Model Design: Five ML models, namely RF, GMDH, multilayer perceptron neural networks (MLPNN), cascade feedforward neural networks (CFFNN), and least-squares support vector regressors (LS-SVR), are built to predict biomass HHV from the important variables found through feature selection. These ML tools adjust their coefficients automatically through optimization algorithms, while their hyperparameters are determined through trial and error or other search techniques. To compare predictions with actual HHVs, four statistical criteria are used: absolute average relative error (AARE), mean squared error (MSE), root mean squared error (RMSE), and regression coefficient (R).
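As a rough illustration of the four criteria named above, the sketch below computes them for a handful of made-up values. The definitions are assumptions, since the article does not spell them out: AARE is taken as the mean absolute relative error expressed in percent, and R as the Pearson correlation between actual and predicted values. The function name `evaluate` and the sample numbers are hypothetical.

```python
import math

def evaluate(actual, predicted):
    """Compute AARE (%), MSE, RMSE, and R for paired actual/predicted values."""
    n = len(actual)
    # Assumed definition: mean of |relative error|, expressed in percent.
    aare = 100.0 / n * sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
    rmse = math.sqrt(mse)
    # Assumed definition: Pearson correlation between actual and predicted.
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r = cov / math.sqrt(var_a * var_p)
    return {"AARE": aare, "MSE": mse, "RMSE": rmse, "R": r}

# Made-up HHV values in MJ/kg, for illustration only.
actual = [18.0, 20.0, 16.0, 19.0]
predicted = [18.9, 19.0, 16.8, 19.0]

metrics = evaluate(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Lower values are better for AARE, MSE, and RMSE, while R should be as close to 1 as possible.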
Selecting the Most Accurate Model: A ranking test determines the best-performing model based on its average rank across the statistical criteria. The MLPNN ranks first, making it the most accurate model for predicting biomass HHV.
Performance Analysis: The MLPNN's performance is further analyzed using scatter plots, error analysis, and statistical characteristics. It consistently shows strong agreement with actual HHV measurements and maintains low errors.
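One way such an average-rank comparison could work is sketched below. This is an assumption about the procedure, not the study's implementation: each model is ranked per criterion (rank 1 is best), the ranks are averaged, and the lowest average wins. The function name `average_ranks`, the placeholder model names, and all scores are hypothetical, and ties are not treated specially.

```python
def average_ranks(results, higher_is_better=("R",)):
    """Average each model's rank (1 = best) over all criteria.

    results maps model name -> {criterion: value}. Criteria listed in
    higher_is_better are ranked descending; all others ascending.
    Ties are broken by input order rather than shared ranks.
    """
    models = list(results)
    criteria = list(next(iter(results.values())))
    totals = {model: 0 for model in models}
    for criterion in criteria:
        ordered = sorted(
            models,
            key=lambda model: results[model][criterion],
            reverse=criterion in higher_is_better,
        )
        for position, model in enumerate(ordered, start=1):
            totals[model] += position
    return {model: totals[model] / len(criteria) for model in models}

# Hypothetical scores for placeholder models; NOT results from the study.
results = {
    "model_a": {"AARE": 3.4, "MSE": 0.20, "RMSE": 0.447, "R": 0.97},
    "model_b": {"AARE": 2.1, "MSE": 0.25, "RMSE": 0.500, "R": 0.95},
    "model_c": {"AARE": 4.0, "MSE": 0.60, "RMSE": 0.775, "R": 0.90},
}

avg = average_ranks(results)
best = min(avg, key=avg.get)
print(avg, "best:", best)
```

Averaging ranks rather than raw metric values keeps criteria with different scales, such as a percentage error and a correlation, from dominating one another.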
Validation by the Literature Model: Recent literature has applied recurrent neural networks (RNN) to predict biomass HHV from compositional analyses. Results revealed that the MLPNN model outperforms the RNN model in terms of statistical metrics, both in the training and testing phases. Visual comparisons through radar graphs reinforce the superior performance of the MLPNN. Overall, the MLPNN is the most accurate and reliable model for predicting biomass HHV based on the identified influential variables.
Conclusion
In summary, feature selection and ML techniques were applied to accurately predict biomass HHV using 532 experimental records. The results show that the multilayer perceptron neural network model outperforms the other models for estimating biomass HHV.
Journal reference:
- Abdollahi, S.A., Ranjbar, S.F. and Razeghi Jahromi, D. (2023). Applying feature selection and machine learning techniques to estimate the biomass higher heating value. Scientific Reports 13, 16093. DOI: 10.1038/s41598-023-43496-x, https://www.nature.com/articles/s41598-023-43496-x
Article Revisions
- Jul 9 2024 - Fixed broken journal link.