In a recent paper published in the journal Scientific Reports, researchers introduced a variant of the You Only Look Once version 5s (YOLOv5s) model, ODEL-YOLOv5s, to tackle the challenges of obstacle identification in a coal mine environment.
Background
The coal mine rail electric locomotive plays a pivotal role in transporting materials, underground operators, and gangue. It directly influences the efficiency of coal mining operations. The advent of driverless technology in coal mine rail electric locomotives has garnered significant attention. This technology exhibits the capacity to avert collisions, derailments, and accidents resulting from the challenging conditions of coal mine roadways and the unpredictability associated with manual operation.
In contrast to classical light detection and ranging (LiDAR)-based methods, deep learning target detection algorithms utilizing image processing techniques have gained traction due to their superior detection accuracy and speed. Implementing these algorithms in obstacle detection for driverless electric locomotives holds the promise of enhancing obstacle detection accuracy and real-time responsiveness, ultimately contributing to the safe operation of such locomotives.
The YOLOv5 model
The YOLOv5 model comprises three parts: the backbone, neck, and head. In the backbone, the Focus module slices and concatenates the input image, followed by convolution operations for feature extraction. In the neck, spatial pyramid pooling (SPP) and path aggregation networks (PANet) enhance feature maps, thereby improving information flow. The model incorporates cross-stage partial (CSP) modules. In the head, the YOLOv3 detection head enables multi-scale prediction.
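The slicing step performed by the Focus module can be illustrated with a minimal pure-Python sketch. The function name `focus_slice`, the nested-list image layout, and the ordering of the four slices are assumptions made for illustration. The real module operates on tensors and is followed by a convolution, which is omitted here.

```python
def focus_slice(image):
    """Rearrange a C x H x W image (nested lists) into 4C x H/2 x W/2.

    Every 2x2 pixel neighbourhood is split into four sub-images, which are
    then stacked along the channel axis. H and W are assumed to be even.
    """
    # (row_offset, col_offset) for the four interleaved sub-grids.
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    out = []
    for row_off, col_off in offsets:
        for channel in image:
            # Take every second row and every second column.
            out.append([row[col_off::2] for row in channel[row_off::2]])
    return out


# A single-channel 4 x 4 image becomes four 2 x 2 channels.
example = [[[1, 2, 3, 4],
            [5, 6, 7, 8],
            [9, 10, 11, 12],
            [13, 14, 15, 16]]]
sliced = focus_slice(example)
```

No pixel is discarded: spatial resolution is halved while the channel count is multiplied by four, so the subsequent convolution sees the full image content at lower spatial cost.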
The YOLOv5 loss function includes three components: classification loss, confidence loss, and localization loss. Localization loss typically uses intersection over union (IoU) to measure the overlap between predicted and ground-truth bounding boxes. Generalized IoU (GIoU) also accounts for cases where the boxes do not overlap, preventing convergence issues during training by considering the minimum enclosing rectangle of the predicted and ground-truth boxes. This helps the model predict obstacle regions accurately. The GIoU loss (LGIoU) acts as the bounding box regression loss in YOLOv5. In addition to the IoU term, it penalizes the portion of the minimum enclosing rectangle that is not covered by the union of the predicted and ground-truth boxes.
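As a rough illustration of the GIoU computation, the sketch below follows the standard definition: GIoU = IoU - (|C| - |A ∪ B|) / |C|, where C is the minimum enclosing rectangle, with the loss taken as 1 - GIoU. The function names and the `(x1, y1, x2, y2)` box format are assumptions for this example, and boxes are assumed to have positive area.

```python
def giou(box_a, box_b):
    """GIoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area (zero when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union area of the two boxes.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union

    # Area of the minimum enclosing rectangle C.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))

    # Subtract the fraction of C not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(pred_box, true_box):
    """Bounding box regression loss: 0 for a perfect match, up to 2."""
    return 1.0 - giou(pred_box, true_box)
```

For two non-overlapping boxes, plain IoU is zero regardless of how far apart they are, whereas GIoU becomes more negative as the gap grows, so the loss still tells the optimizer which direction to move the predicted box.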
Enhanced YOLOv5s for coal mine obstacle detection
Compared with other YOLOv5 models, YOLOv5s is compact and computationally efficient, and it delivers faster detection. This makes it well suited to resource-limited embedded devices operating in harsh coal mine conditions. Data augmentation is used to improve model robustness: saturation, exposure, and hue adjustments provide photometric distortion, while geometric distortions such as cropping, shearing, rotating, and flipping diversify the dataset.
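The two families of augmentation can be sketched with one example of each: a horizontal flip (geometric) and an exposure gain (photometric). This is an illustrative sketch only. The function names, the grayscale nested-list image, and the normalized `(class, cx, cy, w, h)` label format are assumptions, not details taken from the paper.

```python
def horizontal_flip(image, labels):
    """Mirror an H x W image and its normalized (cls, cx, cy, w, h) labels."""
    flipped = [row[::-1] for row in image]
    # Only the box centre's x coordinate changes under a horizontal flip.
    new_labels = [(cls, 1.0 - cx, cy, w, h) for cls, cx, cy, w, h in labels]
    return flipped, new_labels


def adjust_exposure(image, gain):
    """Scale pixel intensities by `gain`, clipping to the 0..255 range."""
    return [[min(255, max(0, round(p * gain))) for p in row] for row in image]


image = [[10, 20, 30],
         [40, 50, 60]]
labels = [(0, 0.25, 0.5, 0.2, 0.4)]  # hypothetical label for illustration

flipped_image, flipped_labels = horizontal_flip(image, labels)
brighter = adjust_exposure(image, 1.5)
```

The key point is that geometric transforms must be applied to the bounding box labels as well as the pixels, whereas photometric transforms leave the labels unchanged.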
The Convolutional Block Attention Module (CBAM) is incorporated into YOLOv5s to bolster detection accuracy. It strengthens the model's focus on obstacle regions during training and adaptively refines the extracted features. To detect small obstacles more effectively, a dedicated small-target prediction layer is also added. This layer combines feature maps from both the backbone and the neck, improving detection accuracy for small objects.
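The attention mechanism can be sketched as follows, assuming the usual CBAM formulation: channel attention from a shared two-layer MLP applied to average- and max-pooled descriptors, followed by spatial attention from a convolution over channel-wise average and max maps. The weights `w1`, `w2`, and `kernel` are hypothetical placeholders that would be learned in practice, and nested lists stand in for tensors.

```python
import math


def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap, w1, w2):
    """One weight per channel. fmap is C x H x W; w1 is (C/r) x C; w2 is C x (C/r)."""
    def mlp(vec):
        hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    flat = [[p for row in ch for p in row] for ch in fmap]
    avg_desc = [sum(f) / len(f) for f in flat]  # global average pooling
    max_desc = [max(f) for f in flat]           # global max pooling
    return [_sigmoid(a + m) for a, m in zip(mlp(avg_desc), mlp(max_desc))]


def spatial_attention(fmap, kernel):
    """One weight per pixel. kernel is 2 x k x k, applied with zero padding."""
    C, H, W = len(fmap), len(fmap[0]), len(fmap[0][0])
    avg_map = [[sum(fmap[c][i][j] for c in range(C)) / C for j in range(W)] for i in range(H)]
    max_map = [[max(fmap[c][i][j] for c in range(C)) for j in range(W)] for i in range(H)]
    k = len(kernel[0])
    pad = k // 2
    out = []
    for i in range(H):
        row = []
        for j in range(W):
            s = 0.0
            for m, pooled in enumerate((avg_map, max_map)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < H and 0 <= x < W:
                            s += kernel[m][di][dj] * pooled[y][x]
            row.append(_sigmoid(s))
        out.append(row)
    return out


def cbam(fmap, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to a C x H x W map."""
    ca = channel_attention(fmap, w1, w2)
    refined = [[[ca[c] * p for p in row] for row in ch] for c, ch in enumerate(fmap)]
    sa = spatial_attention(refined, kernel)
    return [[[sa[i][j] * p for j, p in enumerate(row)] for i, row in enumerate(ch)]
            for ch in refined]
```

Because both attention maps pass through a sigmoid, each one rescales features by a factor between 0 and 1, which lets the network suppress background regions and emphasize channels and locations associated with obstacles.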
Cluster-NMS replaces weighted non-maximum suppression (NMS) for better removal of redundant detections. It offers faster inference and efficient suppression, particularly when obstacles are densely packed or partially occluded, without significantly increasing the number of iterations. With the CBAM attention mechanism, a new prediction scale, and additional layers, the revised ODEL-YOLOv5s model is more complex than the original. These changes are designed to improve detection accuracy, speed, and robustness in the challenging coal mine roadway environment.
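The idea behind Cluster-NMS can be sketched as a matrix-style iteration over a pairwise IoU table, assuming the commonly described formulation: a box is suppressed only if a higher-scoring box that is itself still kept overlaps it beyond the threshold, and the keep mask is updated until it stops changing. The function names and the 0.5 threshold are assumptions for illustration. Practical implementations run this on GPU tensors rather than Python lists.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def cluster_nms(boxes, scores, iou_threshold=0.5):
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    # Upper-triangular IoU matrix: x[i][j] is set only when box i outranks box j.
    x = [[box_iou(boxes[order[i]], boxes[order[j]]) if i < j else 0.0
          for j in range(n)] for i in range(n)]

    keep = [True] * n
    for _ in range(n):
        # Box j survives unless a currently kept, higher-ranked box overlaps it.
        new_keep = [not any(keep[i] and x[i][j] > iou_threshold for i in range(n))
                    for j in range(n)]
        if new_keep == keep:  # converged
            break
        keep = new_keep
    return [order[j] for j in range(n) if keep[j]]


# Three overlapping boxes in a chain plus one isolated box (hypothetical values).
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (2, 2, 12, 12), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7, 0.6]
kept = cluster_nms(boxes, scores)
```

In the example, the second box is suppressed by the first, which in turn frees the third box, since its only strong overlap was with a box that no longer counts. Each update is a whole-matrix operation, which is what makes the approach amenable to parallel execution.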
Experiments and analysis
The researchers created a custom obstacle detection image dataset comprising miners, electric locomotives, and rocks. To ensure dataset diversity, images were captured at varied positions, distances, angles, lighting conditions, and occlusion levels. Post-processing yielded 2000 dataset images. Each obstacle in an image was enclosed in a minimum bounding rectangle and labeled "Miner," "E-L" (Electric Locomotive), or "Rock." The model was evaluated using recall, precision, average precision (AP), mean average precision (mAP), detection speed, model size, and computational load. ODEL-YOLOv5s outperformed the conventional YOLOv5s with more stable training, lower loss, and higher precision, recall, and mAP.
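For reference, the basic accuracy metrics can be written out as a short sketch using their standard definitions. The function names are assumptions, and the counts and per-class AP values in the usage lines are hypothetical placeholders, not results from the paper.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0  # share of detections that are correct
    recall = tp / (tp + fn) if tp + fn else 0.0     # share of real obstacles that are found
    return precision, recall


def mean_average_precision(ap_per_class):
    """mAP is the unweighted mean of the per-class AP values."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Hypothetical numbers, for illustration only.
p, r = precision_recall(tp=90, fp=10, fn=30)
m = mean_average_precision({"Miner": 0.5, "E-L": 1.0, "Rock": 0.75})
```

AP for a single class summarizes the precision-recall curve obtained by sweeping the confidence threshold, so mAP reflects performance across all three obstacle classes rather than at one operating point.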
Results and Analysis: To compare detection performance, ODEL-YOLOv5s and other target detection algorithms were trained and tested on the same custom dataset. YOLOv3 and YOLOv4, which require a large number of floating-point operations (GFLOPs), had slower inference, with average detection speeds of 26.1 frames per second (FPS) and 20.5 FPS, respectively, rendering them unsuitable for real-time detection in harsh coal mine environments.
YOLOv3-Tiny and YOLOv4-Tiny achieved faster detection speeds (162.8 FPS and 156.2 FPS, respectively), but their simpler network structures hindered their ability to detect small rock obstacles.
Variants of YOLOv5 achieved mAPs exceeding 96 percent, but their larger model sizes and computational demands impose challenging hardware requirements. In contrast, the ODEL-YOLOv5s model reached an mAP of 98.9 percent, with an AP of 97.9 percent for small rock obstacles. Although ODEL-YOLOv5s has a larger weight file and a higher computational load than the conventional YOLOv5s model, it still maintained lower resource demands than the other algorithms while achieving an average detection speed of 60.2 FPS, meeting the requirements for real-time detection.
Real Mine Roadway Scenarios: To validate the proposed model under real coal mine conditions, the researchers compared it with the conventional YOLOv5s model. In a further test using 1000 images from the Guqiao coal mine in Huainan City, Anhui Province, ODEL-YOLOv5s outperformed the conventional YOLOv5s model.
Conclusion
In summary, the researchers introduced the ODEL-YOLOv5s model to address the challenges faced by driverless electric locomotives on coal mine roadways. Key improvements include greater robustness through dataset augmentation, sharper focus on obstacle regions via the CBAM attention mechanism, and better small obstacle detection with a new prediction layer. The proposed model achieved favorable results in the reported experiments and can help enhance the safety of driverless electric locomotives in coal mines.