In an article published in the journal Sensors, researchers focused on enhancing pedestrian monitoring in crowded areas like train stations using artificial intelligence (AI) and computer vision (CV).
By employing convolutional neural networks (CNNs) to analyze video footage, the authors tested two methods: tracking individual bounding boxes for three-dimensional (3D) kinematics and analyzing body key points for pose and activity. Their goal was to improve station design and management for safety and efficiency based on accurate passenger movement data.
Background
The increasing use of rail as mass urban transportation underscores the need for efficient station management. Optimal passenger distribution on platforms and during boarding and alighting can reduce dwell times, enhancing both safety and comfort. High-frequency services hinge on minimizing train dwell times, making the optimization of platform-train interactions crucial. Current methods, such as radio frequency identification (RFID) ticket monitoring and cellphone usage analysis, offer limited insights into passenger behavior and station dynamics.
Previous research has utilized various technologies to monitor passenger movements and behaviors. Visual sensors, particularly CV systems, have proven effective. AI algorithms, especially those based on neural networks, have been trained to detect and track objects in crowded environments. Notably, You Only Look Once version 7 (YOLOv7), an open-source tool, has shown promise in detecting multiple object categories in real time. However, most studies focus on general object detection rather than detailed analysis of passenger movements in metro stations.
This paper addressed these gaps by proposing a comprehensive approach to monitor pedestrian behavior in crowded train stations. Leveraging a pre-trained YOLOv7 model with tracking capabilities, the system accurately detected and tracked passengers, converting camera coordinates to station coordinates. It further smoothed the data using a kinematic filter, deriving key metrics such as velocity and density.
Additionally, the study incorporated pose estimation to recognize specific pedestrian activities, enhancing the ability to manage safety and operational efficiency in stations. By providing detailed insights into passenger dynamics, this research aimed to improve the design and management of public transit spaces.
Materials and Methods
The researchers utilized YOLOv7, an object detection algorithm, enhanced with tracking capability, for detecting and tracking people and trains in video frames. YOLOv7, trained on the Microsoft common objects in context (MS COCO) dataset, offered efficient detection and tracking with high accuracy and real-time performance.
Tracking was facilitated by the simple online and real-time tracking (SORT) algorithm, enabling object continuity in two-dimensional (2D) space. Camera calibration was essential for converting pixel coordinates to real-world distances, employing forward imaging modeling and homogeneous coordinates. An alpha-beta filter addressed gaps in data by estimating missing information and smoothing trajectories based on constant velocity assumptions.
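The constant-velocity smoothing described above can be sketched as a one-dimensional alpha-beta filter. This is a minimal illustration, not the paper's implementation: the gain values and time step below are assumed for the example, and the filter simply coasts on its prediction when a detection is missing.

```python
def alpha_beta_filter(measurements, dt=1.0, alpha=0.85, beta=0.005):
    """Smooth a 1-D position track with a constant-velocity alpha-beta filter.

    `measurements` may contain None for frames where detection failed;
    the filter then fills the gap with its constant-velocity prediction.
    The gains alpha and beta are illustrative, not the paper's values.
    """
    x, v = measurements[0], 0.0  # initial state: first position, zero velocity
    smoothed = [x]
    for z in measurements[1:]:
        x_pred = x + v * dt        # predict assuming constant velocity
        if z is None:              # missing detection: keep the prediction
            x = x_pred
        else:                      # correct the prediction toward the measurement
            r = z - x_pred
            x = x_pred + alpha * r
            v = v + (beta / dt) * r
        smoothed.append(x)
    return smoothed

track = alpha_beta_filter([0.0, 1.0, None, 3.0])
```

In a 2-D or 3-D track, the same update would simply be applied per coordinate.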
Evaluation against ground truth data revealed the algorithm's accuracy in detection and tracking. Activity recognition involved extracting poses from video frames using AlphaPose and employing a pre-trained random forest model to predict activities. Challenges included adapting the model trained on simulated data to real-world scenarios with occlusions and multiple individuals.
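The pose-to-activity step can be illustrated with a random forest classifier over flattened key-point coordinates. Everything below is a toy sketch under assumptions: the 17-key-point layout, the synthetic "sitting" vs. "walking" clusters, and the forest size are all invented for the example and are not the paper's training data or model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical training data: each row flattens 17 (x, y) body key points
# (34 features), roughly the format a pose estimator such as AlphaPose emits.
# The two synthetic clusters exist only so the toy model has something to learn.
n = 200
sitting = rng.normal(0.2, 0.05, size=(n, 34))
walking = rng.normal(0.8, 0.05, size=(n, 34))
X = np.vstack([sitting, walking])
y = np.array(["sitting"] * n + ["walking"] * n)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)

# Classify a new pose vector drawn near the "walking" cluster.
new_pose = rng.normal(0.8, 0.05, size=(1, 34))
print(clf.predict(new_pose)[0])  # classifies as "walking" on this toy data
```

In practice the features would be normalized (e.g. relative to the hip center) so the classifier is invariant to where the person stands in the frame.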
Experimental Results and Preliminary Behavior Analysis
The experimental results and behavioral analysis encompassed three distinct video scenarios captured in an underground train station. Initially, a proof-of-concept video with a limited number of individuals facilitated algorithm testing and validation. Camera calibration, crucial for accurate tracking, was achieved through a set of reference points covering the station's volume. Despite challenges like occlusions, the tracking algorithms maintained consistency, aided by an alpha-beta filter for trajectory smoothing.
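The calibration step, mapping pixel coordinates to station coordinates from reference points, can be sketched for the planar (ground-plane) case with a direct linear transform in homogeneous coordinates. This is a simplification: the paper calibrates over the station's volume, and the reference points below are invented for the example.

```python
import numpy as np

def fit_homography(pixel_pts, world_pts):
    """Estimate a 3x3 homography mapping pixel coordinates to ground-plane
    coordinates from >= 4 point correspondences (direct linear transform)."""
    A = []
    for (u, v), (x, y) in zip(pixel_pts, world_pts):
        A.append([u, v, 1, 0, 0, 0, -x * u, -x * v, -x])
        A.append([0, 0, 0, u, v, 1, -y * u, -y * v, -y])
    # The homography is the right singular vector of the smallest singular value.
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def pixel_to_world(H, u, v):
    """Apply the homography in homogeneous coordinates and dehomogenize."""
    p = H @ np.array([u, v, 1.0])
    return p[0] / p[2], p[1] / p[2]

# Hypothetical reference points: image corners of a 10 m x 6 m platform area.
px = [(0, 0), (640, 0), (640, 480), (0, 480)]
wd = [(0.0, 0.0), (10.0, 0.0), (10.0, 6.0), (0.0, 6.0)]
H = fit_homography(px, wd)
```

With more than four reference points the same least-squares solve averages out measurement noise, which is why calibration sets typically cover the whole scene.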
In a subsequent video featuring a larger crowd and the presence of a train, tracking challenges intensified. Occlusions led to temporary loss and reassignment of tracking identifications (IDs), necessitating advanced filtering techniques. Performance evaluation of the YOLOv7 detection method revealed accurate bounding box assignments, albeit with occasional discrepancies due to occlusions and projection errors during 2D-to-3D conversion.
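Evaluating bounding-box assignments of this kind is conventionally done with intersection over union (IoU), the same overlap measure SORT-style trackers use to associate detections across frames. The paper does not spell out its metric, so the function below is the standard definition rather than the authors' exact procedure.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Overlap rectangle: the tighter of the two boxes on each side.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0
```

A detection is usually counted as correct when its IoU with a ground-truth box exceeds a threshold such as 0.5; low IoU between consecutive frames is exactly the situation in which a tracker loses and reassigns an ID.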
A longer-duration video allowed for insights into station dynamics, including the detection of individuals inside trains and spatial density analysis. Activity recognition using a random forest model showed promising results in identifying sitting and walking behaviors, albeit with some false alarms, highlighting the potential for model refinement.
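Once passenger positions are in station coordinates, the spatial density analysis mentioned above amounts to counting people per grid cell. The platform dimensions and cell size below are assumptions for illustration, not the station's actual geometry.

```python
import numpy as np

def density_map(positions, platform_w=60.0, platform_d=8.0, cell=2.0):
    """Count pedestrians per grid cell and convert to persons per square metre.

    `positions` is an (N, 2) array of station-frame (x, y) coordinates in
    metres. Platform dimensions and cell size are illustrative values.
    """
    xs = np.arange(0.0, platform_w + cell, cell)
    ys = np.arange(0.0, platform_d + cell, cell)
    counts, _, _ = np.histogram2d(positions[:, 0], positions[:, 1], bins=[xs, ys])
    return counts / (cell * cell)  # persons per m^2 in each cell

# Three hypothetical passengers; two share the first 2 m x 2 m cell.
positions = np.array([[1.0, 1.0], [1.5, 1.2], [30.0, 4.0]])
dens = density_map(positions)
print(dens.max())  # densest cell: 2 people in a 4 m^2 cell -> 0.5
```

Aggregating such maps per video frame is what lets crowding hot spots near doors or stairs be located over time.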
Behavioral analysis integrated kinematic data and activity classification, revealing passengers' actions throughout different stages, from entering and waiting on the platform to boarding and alighting trains. This multi-faceted approach provided comprehensive insights into passenger behavior and station dynamics, laying the groundwork for future improvements in tracking accuracy and activity recognition algorithms.
Conclusion
The researchers successfully tested AI and CV techniques for monitoring pedestrian behavior in crowded train stations. Using YOLOv7 for detection and tracking, and AlphaPose for activity recognition, the researchers analyzed videos to extract kinematic and pose information.
Despite challenges like occlusions, the methods proved effective in capturing accurate passenger movements and activities, providing valuable insights for station management. The integration of kinematic and activity data highlighted potential improvements in safety and efficiency. Future enhancements, including model fine-tuning and adaptation to varying conditions, can further optimize this approach, contributing to better station design and operations.
Journal reference:
- Garcia, G., Velastin, S. A., Lastra, N., Ramirez, H., Seriani, S., & Farias, G. (2024). Train Station Pedestrian Monitoring Pilot Study Using an Artificial Intelligence Approach. Sensors, 24(11), 3377. https://doi.org/10.3390/s24113377