In an article published in the journal PLOS One, researchers used machine learning to build a predictive model of digital transformation in businesses. They compared several machine learning algorithms to find the most accurate predictor, identified the key factors influencing digital transformation, and proposed improvement strategies.
Using correlation analysis and an interpretability analysis based on Shapley additive explanations (SHAP) values, the authors showed how these factors affect digital transformation and suggested quantitative adjustment strategies for enhancing digital development in enterprises.
Background
The global economy is rapidly shifting towards digitalization, revolutionizing traditional business models and emphasizing the importance of digital transformation for enterprises. While existing research has extensively examined the benefits of digital transformation on firm development, there remains a gap in understanding how to enhance digital transformation capabilities at the firm level.
The present paper addressed this gap by leveraging machine learning, a powerful technology in computer science, to investigate the impact of various indicators on digital transformation in Chinese-listed manufacturing companies. Previous studies have demonstrated the benefits of digital transformation on innovation capabilities, corporate value, and environmental performance. However, they have primarily focused on the outcomes of digital transformation rather than strategies for enhancing its effectiveness.
By analyzing data from Chinese-listed manufacturing companies, this study contributed to the literature by identifying key indicators that influence digital transformation and proposing adjustment strategies to strengthen that capability. It thereby provided actionable insights for enterprises seeking to accelerate their digitalization and remain competitive in the digital economy, bridging the gap between the theoretical understanding and the practical implementation of digital transformation strategies.
Data and methods
The study investigated the digital transformation capability (DCG) of Chinese-listed companies from 2014 to 2021. An initial 22,776 samples were collected from authoritative databases; restricting the data to the manufacturing sector left 12,057 samples. DCG was assessed by applying Python web scraping to keywords in annual reports, with a logarithmic transformation applied for uniformity. Samples were categorized as having high or low DCG based on a threshold of 1.5, yielding 6,280 low-DCG and 5,777 high-DCG samples.
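The labeling step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the article only says that a logarithmic transformation and a threshold of 1.5 were used, so the exact transform (`log(1 + count)`), the strict "greater than" comparison, and the firm names and keyword counts below are all assumptions.

```python
import math

DCG_THRESHOLD = 1.5  # threshold reported in the article


def dcg_score(keyword_count: int) -> float:
    """Log-transform a raw keyword count. log(1 + count) is an assumed form."""
    return math.log1p(keyword_count)


def dcg_label(keyword_count: int, threshold: float = DCG_THRESHOLD) -> str:
    """Return 'high' when the transformed score exceeds the threshold, else 'low'.

    Whether the boundary case counts as high or low is an assumption.
    """
    return "high" if dcg_score(keyword_count) > threshold else "low"


# Hypothetical keyword counts per annual report (illustrative, not study data).
counts = {"firm_a": 0, "firm_b": 3, "firm_c": 4, "firm_d": 40}
labels = {firm: dcg_label(n) for firm, n in counts.items()}
```

Under these assumptions, a report with three keyword hits scores about 1.39 and is labeled low, while one with four hits scores about 1.61 and is labeled high.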
Feature engineering involved selecting financial and non-financial indicators, categorized into various aspects of company performance. Machine learning models including extreme random trees, gradient boosting machines, support vector machines, logistic regression, and multi-layer perceptrons were employed for DCG prediction. Model validation used cross-validation and holdout methods to minimize overfitting. Evaluation metrics such as accuracy, precision, recall, and F1 score were derived from a confusion matrix, and model performance was compared using receiver operating characteristic (ROC) curves.
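The four evaluation metrics named above follow directly from the confusion matrix of a binary classifier. The sketch below uses only the standard library; the labels are made up for illustration (1 standing for high DCG, 0 for low), and the function names are hypothetical rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ConfusionMatrix:
    tp: int  # predicted high, actually high
    fp: int  # predicted high, actually low
    fn: int  # predicted low, actually high
    tn: int  # predicted low, actually low


def confusion_matrix(y_true, y_pred, positive=1) -> ConfusionMatrix:
    """Count the four outcome types for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return ConfusionMatrix(tp, fp, fn, tn)


def metrics(cm: ConfusionMatrix) -> dict:
    """Derive accuracy, precision, recall, and F1; empty denominators give 0.0."""
    total = cm.tp + cm.fp + cm.fn + cm.tn
    accuracy = (cm.tp + cm.tn) / total if total else 0.0
    precision = cm.tp / (cm.tp + cm.fp) if (cm.tp + cm.fp) else 0.0
    recall = cm.tp / (cm.tp + cm.fn) if (cm.tp + cm.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only: 1 = high DCG, 0 = low DCG.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
cm = confusion_matrix(y_true, y_pred)
scores = metrics(cm)
```

With one false positive and one false negative out of eight samples, all four metrics come out to 0.75 in this toy case.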
Hyperparameter optimization, crucial for model performance, was conducted through Bayesian optimization due to the dataset's complexity. The best hyperparameters for each model were selected based on accuracy. Overall, the study took a comprehensive approach to investigating and predicting DCG in Chinese-listed companies, integrating data collection, feature engineering, machine learning modeling, and rigorous validation techniques.
Results and discussion
The performance comparison of machine learning models revealed that extreme random trees and gradient boosting machines outperformed support vector machines and multi-layer perceptrons. Extreme random trees achieved the highest accuracy, F1 score, recall, and precision, as well as the largest area under the ROC curve, making them the best-performing predictor among the models compared.
Following model selection, feature screening was conducted to refine the dataset. Recursive feature elimination (RFE) and exhaustive feature selection (EFS) methods identified critical features influencing DCG. These features included research and development expenditure ratios, leverage ratios, and asset turnover ratios. To enhance interpretability, SHAP values were employed, highlighting the relative importance of features and their impact on DCG.
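To make the idea behind SHAP values concrete, the sketch below computes exact Shapley values for a single sample by enumerating every subset of features. It is a toy under stated assumptions, not the study's analysis: the linear scoring function, its weights, the three feature names, and the sample and baseline values are all hypothetical, and libraries such as SHAP rely on faster model-specific or sampling-based algorithms rather than brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample.

    Features outside a coalition are replaced by their baseline value.
    Cost grows exponentially with the number of features.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical feature order: [rd_ratio, leverage, financial_constraint].
def toy_score(z):
    """Made-up linear score; the weights are not estimates from the paper."""
    return 2.0 * z[0] + 0.5 * z[1] - 1.0 * z[2]


x = [0.3, 0.6, 0.2]         # one illustrative company
baseline = [0.1, 0.4, 0.5]  # illustrative reference values
phi = shapley_values(toy_score, x, baseline)
```

For a linear score each contribution reduces to the weight times the feature's deviation from the baseline, and the contributions sum to the difference between the sample's score and the baseline score. That additivity is what lets SHAP rank features by their share of a prediction.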
Features such as research and development expenditure ratios and leverage ratios demonstrated positive effects on DCG, while financial constraints and equity balance exerted negative influences. A quantitative adjustment strategy was proposed based on the predictive model and interpretability analysis. This strategy focuses on adjusting easily modifiable features like research and development expenditure ratios, leverage ratios, and asset turnover ratios to improve DCG.
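The article does not spell out the adjustment procedure, so the following is only one plausible reading of it: a greedy search that nudges modifiable features in small steps until a model's predicted probability of high DCG crosses 0.5. The probability function, its coefficients, the step sizes, and the bounds are all hypothetical stand-ins for a trained model and realistic business limits.

```python
import math


def toy_proba(f):
    """Made-up logistic model of P(high DCG); coefficients are illustrative."""
    z = (8.0 * f["rd_ratio"] + 1.0 * f["leverage"]
         + 0.5 * f["asset_turnover"] - 1.58)
    return 1.0 / (1.0 + math.exp(-z))


def adjust_to_high(predict_proba, sample, steps, bounds,
                   target=0.5, max_iter=100):
    """Greedily move one feature per iteration, choosing the largest gain.

    Returns the adjusted feature dict, or None if the target is unreachable
    within the bounds or the iteration budget.
    """
    current = dict(sample)
    for _ in range(max_iter):
        if predict_proba(current) >= target:
            return current
        best_p, best_trial = predict_proba(current), None
        for name, step in steps.items():
            lo, hi = bounds[name]
            trial = dict(current)
            trial[name] = min(hi, max(lo, trial[name] + step))
            p = predict_proba(trial)
            if p > best_p:
                best_p, best_trial = p, trial
        if best_trial is None:  # no step improves the prediction
            return None
        current = best_trial
    return current if predict_proba(current) >= target else None


# Hypothetical low-DCG company, step sizes, and allowed ranges.
sample = {"rd_ratio": 0.02, "leverage": 0.40, "asset_turnover": 0.60}
steps = {"rd_ratio": 0.01, "leverage": 0.05, "asset_turnover": 0.05}
bounds = {"rd_ratio": (0.0, 0.10), "leverage": (0.0, 0.60),
          "asset_turnover": (0.0, 1.50)}
adjusted = adjust_to_high(toy_proba, sample, steps, bounds)
```

The bounds matter as much as the model: they keep the search from recommending changes a company could not realistically make, such as an unlimited increase in leverage.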
A case study illustrated how adjustments in these features transformed companies from low to high DCG, demonstrating the practical application of the proposed strategy. Overall, the study provided insights into the factors influencing DCG and offered a systematic approach to enhance digital transformation in companies.
Conclusion
In conclusion, the researchers used machine learning to predict digital transformation capability in Chinese-listed companies. Extreme random trees and gradient boosting machines outperformed other algorithms. Key indicators influencing digital transformation were identified through feature engineering and SHAP analysis.
The authors proposed quantitative adjustment strategies for enhancing digital transformation, providing actionable insights for businesses. Overall, they bridged the gap between theory and practice in digital transformation strategies, offering valuable guidance for companies navigating the complexities of digitalization.