In an article recently published in the journal Scientific Reports, researchers proposed a collaborative model based on federated learning (FL) for forecasting passenger demand for autonomous taxis (ATs) in smart cities while preserving user data privacy.
Background
Smart cities are crucial for providing integrated, efficient, and sustainable urban environments that enhance the quality of life of residents while spurring economic development. In smart cities, ATs have emerged as a feasible solution for efficient and sustainable transportation by minimizing taxi and passenger waiting times, decreasing traffic congestion, and optimizing energy consumption.
The growing integration of autonomous vehicles (AVs) into intelligent transportation systems (ITSs) has increased the importance of accurate passenger demand forecasting for ATs in urban areas, particularly smart cities. Accurate forecasts can minimize passenger waiting time and optimize the ATs’ cruising time, which in turn improves passenger experience, reduces energy consumption, and increases transportation system efficiency.
Limitations of traditional approaches
Machine learning (ML) has been used to develop cost-effective forecasting models with improved performance in the AV industry. However, collecting the large amounts of data required to train these ML-based solutions is a significant challenge due to data security and privacy concerns.
Specifically, the ML-based solutions used in the transportation industry are often centralized, which means sensitive passenger information is collected and stored in a single location. This arrangement raises data privacy and security concerns and increases the likelihood of connection delays and network latency. Additionally, transmitting substantial amounts of passenger data over long distances can be highly expensive for transportation service providers.
FL as a possible solution
FL, an alternative ML approach, enables clients to process data locally and train their models independently while maintaining the security and privacy of their data. Because raw data stays on the client, FL can mitigate network issues and reduce communication overhead, which makes it a suitable basis for secure, privacy-preserving prediction models for ATs.
FL-based models offer a decentralized learning approach, which allows AT companies to train secure and accurate models that can allocate taxis to passengers efficiently while protecting passenger data safety and privacy. In the context of ATs, FL allows regions to collaborate on developing passenger demand forecasting models without directly sharing passenger data, which helps ensure data privacy.
The proposed FL-based approach
In this study, researchers proposed a collaborative model using FL for passenger demand forecasting for ATs within smart city transportation systems to overcome the limitations of traditional forecasting approaches. The proposed approach enables ATs in several regions of the smart city to collaboratively learn and improve their demand forecasting models through FL while preserving passenger data privacy.
Multiple backpropagation neural networks were used as local models that collaborated to train the global model without any direct sharing of passenger data. Each local model shared only its model updates with the global model, which aggregated those updates. The aggregated model was then distributed back to the local models to improve them.
Thus, this collaborative approach can reduce communication costs and privacy concerns by allowing regions to benefit from each other’s data without direct data sharing. The proposed collaborative model framework consisted of four layers: a data acquisition layer, a preprocessing layer, a training application layer, and a validation layer.
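The update-sharing loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the article does not state the aggregation rule, so weighted averaging of client parameters (in the style of federated averaging) is assumed here, and a one-feature linear regressor trained by gradient descent stands in for the backpropagation neural networks. All function names and the synthetic regional data are hypothetical.

```python
import random

def local_update(weights, data, lr=0.05, epochs=5):
    """Train a copy of the global weights on one region's private data.

    Assumption: a linear model y = w*x + b stands in for the paper's
    backpropagation neural network. Only the weights leave this function.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Aggregate client weights, weighting each client by its sample count.

    Assumption: sample-weighted averaging; the source does not specify
    the aggregation rule.
    """
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def run_federated_training(regions, rounds=200):
    """Repeat: broadcast global weights, train locally, aggregate."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in regions]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in regions]
        global_weights = federated_average(updates, sizes)
    return global_weights

def make_region(n, seed, lo, hi):
    """Synthetic (feature, demand) pairs following y = 3x + 2 plus noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(lo, hi) for _ in range(n)]
    return [(x, 3.0 * x + 2.0 + rng.gauss(0, 0.05)) for x in xs]

# Three hypothetical city regions; their raw data is never pooled.
regions = [
    make_region(40, 1, 0.0, 1.0),
    make_region(60, 2, 0.5, 1.5),
    make_region(50, 3, 1.0, 2.0),
]
w, b = run_federated_training(regions)
print(f"global model: w={w:.3f}, b={b:.3f}")
```

In this sketch the server only ever sees the `(w, b)` pairs returned by `local_update`, which mirrors the idea that model updates, not passenger records, are exchanged.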
Researchers evaluated the proposed approach using a historical dataset from over 4,500 taxis in Bangkok, Thailand. The data was collected using Internet of Things (IoT) devices installed in the taxis. To validate model accuracy, they used MATLAB R2022b to compare the proposed FL-based approach with popular baselines for regression problems and with existing approaches from the literature on taxi demand forecasting.
The baseline methods used as benchmarks for model evaluation were gradient boosting (GB), multi-layer perceptron (MLP), ensemble bagging (EN-BA), support vector regression (SVR), and random forest (RF). Three evaluation metrics, R-squared (R2), mean absolute error (MAE), and root mean square error (RMSE), were used to determine model accuracy in AT demand forecasting.
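For readers unfamiliar with these three metrics, the sketch below computes them using their standard definitions. The study itself used MATLAB; Python is used here only for illustration, and the observed and predicted demand values are made-up numbers, not data from the study.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average size of the errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total variance."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical hourly pickup counts (illustrative values only)
observed = [12, 30, 45, 22, 8]
predicted = [10, 33, 41, 25, 9]

print(f"MAE  = {mae(observed, predicted):.2f}")
print(f"RMSE = {rmse(observed, predicted):.2f}")
print(f"R2   = {r_squared(observed, predicted):.3f}")
```

Lower MAE and RMSE indicate smaller forecasting errors, while an R2 closer to 1 indicates that the model explains more of the variance in observed demand.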
Significance of the study
The FL-based approach displayed the best performance among all methods in the comparative analysis. The proposed model had the lowest MAE of 5.32, while SVR demonstrated the second-best performance with an MAE of 5.37. Similarly, the proposed model showed the lowest RMSE of 9.12, while MLP demonstrated the second-best performance with an RMSE of 10.69. The FL model also yielded the highest R2 value of 0.93, which was notably higher than the second-best R2 value of 0.89 achieved using MLP.
These simulation results demonstrated the superior overall performance and predictive accuracy of the proposed model while ensuring passenger data privacy. Moreover, the proposed model outperformed the existing approaches mentioned in the literature based on the MAE and RMSE evaluation metrics.
To summarize, the findings of this study demonstrated the feasibility of using the proposed FL-based approach for accurate passenger demand forecasting for ATs while preserving user data privacy.