In a recent article published in the journal Cleaner Environmental Systems, researchers proposed an innovative machine learning method to estimate missing data in road statistical datasets for material stock and flow analysis. They aimed to show how machine learning models can support national-level analyses of road material stock and flow, and how this can help assess the potential for reducing embodied carbon emissions from road construction.
Background
Road construction is a major driver of material demand and greenhouse gas (GHG) emissions in the global construction sector. Material flow analysis (MFA) is useful for estimating embodied emissions and material flows in transport infrastructure. However, it is often limited by scarce or incomplete data. Missing attributes, such as road width, can reduce the accuracy of MFA studies. Therefore, there is a need for new methods that can use a broader range of available data to predict missing information at finer geographic resolution.
About the Research
In this paper, the authors developed a machine learning method to predict missing road width data in a Swedish national road dataset using only open-source data and software. They assumed that the physical attributes of roads in a region are somewhat correlated and that these correlations can be captured by a machine learning regression model. Their approach involved four steps: 1) Data gathering and preprocessing, 2) Feature engineering, 3) Machine learning, and 4) Material stock and flow analysis.
The researchers used various data sources, including road shapefiles with attributes, building footprint shapefiles, socio-economic data, and material intensity data for road types. The features of the machine learning model included road, building, socio-economic, and network features.
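The idea of filling gaps in a road dataset with model predictions can be illustrated with a minimal Python sketch. The segment records, field names, and the class-mean predictor below are all hypothetical: the predictor is a deliberately simple stand-in for a trained regression model, not the approach used in the study.

```python
from statistics import mean

# Hypothetical road segments; width_m is None where the dataset has a gap.
roads = [
    {"id": 1, "road_class": "local", "length_m": 500.0, "width_m": 5.0},
    {"id": 2, "road_class": "local", "length_m": 300.0, "width_m": 6.0},
    {"id": 3, "road_class": "local", "length_m": 800.0, "width_m": None},
    {"id": 4, "road_class": "highway", "length_m": 1200.0, "width_m": 12.0},
    {"id": 5, "road_class": "highway", "length_m": 900.0, "width_m": None},
]


def fit_class_mean(segments):
    """Stand-in 'model': mean observed width per road class (assumption)."""
    observed = {}
    for seg in segments:
        if seg["width_m"] is not None:
            observed.setdefault(seg["road_class"], []).append(seg["width_m"])
    return {cls: mean(widths) for cls, widths in observed.items()}


def build_hybrid(segments, model):
    """Keep measured widths, fill gaps with predictions, and flag which is which."""
    hybrid = []
    for seg in segments:
        filled = dict(seg)
        filled["synthetic"] = seg["width_m"] is None
        if filled["synthetic"]:
            filled["width_m"] = model[seg["road_class"]]
        hybrid.append(filled)
    return hybrid


model = fit_class_mean(roads)
hybrid = build_hybrid(roads, model)
for seg in hybrid:
    print(seg["id"], seg["width_m"], "synthetic" if seg["synthetic"] else "measured")
```

Keeping a `synthetic` flag on every record makes it possible to separate measured from predicted widths in any later analysis.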
They tested four machine learning algorithms: Random Forest, extreme gradient boosting (XGBoost), categorical boosting (CatBoost), and light gradient boosting machine (LightGBM). Model performance was evaluated using mean absolute error (MAE), mean absolute percentage error (MAPE), and coefficient of determination (R²).
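For readers unfamiliar with these three metrics, the following Python sketch computes them from their standard definitions. The observed and predicted widths are made-up numbers for illustration only and are unrelated to the study's data.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the target (here metres)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no true value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical road widths in metres.
observed = [4.0, 5.0, 8.0, 10.0]
predicted = [4.5, 5.0, 7.0, 10.5]

print(f"MAE  = {mae(observed, predicted):.3f} m")
print(f"MAPE = {mape(observed, predicted):.1f} %")
print(f"R2   = {r2(observed, predicted):.3f}")
```

MAE reports the typical error in metres, MAPE expresses it relative to the true width, and R² indicates how much of the variation in width the model explains.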
Furthermore, the resulting machine learning model was used to create a hybrid dataset with real and synthetic road widths. This dataset was then used for a stock-driven MFA of Swedish roads up to 2045. Scenario-based emission factors were applied to estimate embodied emissions and potential reduction pathways.
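In a stock-driven MFA, the stock is specified for each year and the inflows and outflows are derived from it. The Python sketch below shows that logic under a strong simplifying assumption: a constant fraction of the stock (one over an assumed lifetime) is renewed each year. The stock series, the 40-year lifetime, and the function name are hypothetical, and the study may well use a different lifetime model.

```python
def stock_driven_flows(stock, lifetime_years):
    """Derive annual inflows and outflows from a given stock time series.

    Assumption: each year, stock / lifetime_years leaves the stock (outflow).
    Mass balance then gives: inflow = stock change + outflow.
    """
    inflow, outflow = [], []
    for prev, curr in zip(stock, stock[1:]):
        out = prev / lifetime_years
        outflow.append(out)
        inflow.append(curr - prev + out)
    return inflow, outflow


# Hypothetical material stock in million tonnes for three consecutive years.
stock_mt = [1000.0, 1010.0, 1020.0]
inflow, outflow = stock_driven_flows(stock_mt, lifetime_years=40)

for year, (i, o) in enumerate(zip(inflow, outflow), start=1):
    print(f"year {year}: inflow {i:.2f} Mt, outflow {o:.2f} Mt")
```

Even with a slowly growing stock, most of the inflow in this toy example goes to replacing material that leaves the stock, which is why maintenance and renewal matter for embodied emissions.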
Research Findings
The results showed that the XGBoost model outperformed the other algorithms, achieving an MAE of 0.567 m, a MAPE of 12.5%, and an R² value of 0.784. With this level of error, the model could fill in missing road widths using only open-source data. The resulting dataset was then used to estimate the material stock, flows, and embodied emissions of Swedish roads from 2020 to 2045 from a supply chain perspective.
The study estimated that in 2020, the total material stock of Swedish roads was about 1.8 billion tonnes, including 1.6 billion tonnes of aggregates, 0.2 billion tonnes of asphalt, and 0.01 billion tonnes of steel. The total embodied emissions were about 16.8 million tonnes of carbon dioxide, with 11.4 million tonnes from asphalt, 4.9 million tonnes from steel, and 0.5 million tonnes from aggregates.
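A quick calculation on these reported figures shows how differently mass and emissions are distributed across materials. The Python sketch below uses only the rounded numbers quoted above, so the shares are indicative rather than exact.

```python
# Reported 2020 material stock, in million tonnes (rounded figures from the text).
stock_mt = {"aggregates": 1600.0, "asphalt": 200.0, "steel": 10.0}

# Reported 2020 embodied emissions, in million tonnes of CO2 (rounded figures).
emissions_mt = {"asphalt": 11.4, "steel": 4.9, "aggregates": 0.5}

total_stock = sum(stock_mt.values())
total_emissions = sum(emissions_mt.values())

# Share of each material in total mass and in total embodied emissions.
mass_share = {m: v / total_stock for m, v in stock_mt.items()}
emission_share = {m: v / total_emissions for m, v in emissions_mt.items()}

for material in stock_mt:
    print(
        f"{material:10s} mass share {mass_share[material]:6.1%}, "
        f"emission share {emission_share[material]:6.1%}"
    )
```

On these figures, aggregates make up most of the mass but only a small part of the emissions, while asphalt and steel account for nearly all of the embodied carbon.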
Furthermore, the authors examined two future scenarios: a Business-as-usual scenario and an Emission Reduction scenario. The Business-as-usual scenario assumed no changes in emissions from material production after 2020, while the Emission Reduction scenario anticipated significant reductions in emissions from material production over the studied period. The study found that the embodied emissions of Swedish roads could be lowered by up to 51% in the Emission Reduction scenario compared to the Business-as-usual scenario by using available materials.
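The way such a scenario comparison works can be sketched in a few lines of Python. Every number below (the annual material inflow, the 2020 emission factor, and the reduced 2045 emission factor) is hypothetical, and the linear decline is an assumption, so the output illustrates the method only and does not reproduce the study's 51% figure.

```python
YEARS = list(range(2020, 2046))  # 2020 to 2045 inclusive

INFLOW_MT = 30.0         # hypothetical constant annual material inflow, Mt
EF_2020 = 0.05           # hypothetical emission factor in 2020, t CO2 per t material
EF_2045_REDUCED = 0.01   # hypothetical emission factor reached in 2045


def bau_factor(year):
    """Business-as-usual: the emission factor stays at its 2020 level."""
    return EF_2020


def reduction_factor(year):
    """Emission Reduction: assumed linear decline from 2020 to 2045."""
    progress = (year - YEARS[0]) / (YEARS[-1] - YEARS[0])
    return EF_2020 + progress * (EF_2045_REDUCED - EF_2020)


def cumulative_emissions(factor):
    """Sum of annual inflow times the scenario's emission factor, in Mt CO2."""
    return sum(INFLOW_MT * factor(year) for year in YEARS)


bau_total = cumulative_emissions(bau_factor)
reduced_total = cumulative_emissions(reduction_factor)
reduction_pct = 100.0 * (bau_total - reduced_total) / bau_total

print(f"Business-as-usual:  {bau_total:.1f} Mt CO2")
print(f"Emission Reduction: {reduced_total:.1f} Mt CO2")
print(f"Reduction:          {reduction_pct:.1f} %")
```

Because both scenarios share the same material inflows, the difference between them comes entirely from the assumed emission factors.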
Applications
The research highlights how machine learning can be applied to MFA studies of roads, addressing data gaps and improving the accuracy of material stock and flow estimates. It shows how the proposed method can evaluate potential reductions in embodied carbon emissions from road construction and explore various decarbonization scenarios and pathways. The findings provide valuable insights for policymakers, road authorities, construction companies, and researchers interested in the environmental impacts of road infrastructure.
Conclusion
In summary, the machine learning technique estimated missing data in road datasets and supported estimates of the material stock, flows, and embodied emissions of Swedish roads from 2020 to 2045. The researchers explored two future scenarios for emissions from material production, assessing potential reductions in embodied carbon emissions. They demonstrated the value of machine learning for national-level road material analyses and for supporting decarbonization efforts. For future work, they recommended including various road types, components, and materials, as well as conducting uncertainty and sensitivity analyses.