In an article published in the journal Nature, researchers presented NLE-YOLO (NLE-you only look once), a low-light target detection network built on YOLOv5 that addresses the challenges of insufficient illumination and noise interference.
Using innovative preprocessing techniques and feature extraction modules, including C2fLEFEM and AMC2fLEFEM, the network strengthened feature extraction, adapted to brightness changes, and reduced the impact of noise. Experiments on the ExDark dataset demonstrated superior detection accuracy and performance compared with previous methods in low-light conditions.
Background
In recent years, rapid advances in artificial intelligence (AI) have made object detection, a crucial component of computer vision, widely used in diverse fields such as autonomous driving and video surveillance. However, target detection algorithms face significant challenges in real-world scenarios, particularly in low-light conditions, where images suffer from underexposure, color distortion, and reduced contrast. The limitations of imaging hardware in such conditions pose a further hurdle.
Prior research attempted to enhance hardware devices, but such upgrades are expensive and not universally deployable. Alternatively, researchers have focused on adapting target detection algorithms to low-light conditions. Methods that rely on preprocessing techniques such as low-light image enhancement struggle in extreme conditions, leading to degraded detection accuracy.
This paper introduced the NLE-YOLO low-light object detection model to address the shortcomings of existing techniques. The proposed model took an enhanced image as input and incorporated a novel feature extraction module, C2fLEFEM, to suppress high-frequency noise and enhance crucial information. Additionally, the model introduced an attention mechanism receptive field block (AMRFB) to broaden the receptive field and improve feature extraction.
Substituting the original detection head with a decoupled head, the NLE-YOLO model was tailored for low-light conditions and showed promising results on the ExDark dataset. The contributions included a novel low-light object detection network, innovative feature extraction and attention modules, and adaptations for efficient performance in low-light scenarios, collectively filling gaps in existing approaches to low-light object detection.
Methods
Low-light conditions posed challenges to conventional object detection due to poor visibility, low contrast, and noise interference. The proposed NLE-YOLO addressed these issues by incorporating the C2fLEFEM module, which replaced the C3 module in the YOLOv5 backbone. This module effectively suppressed high-frequency noise while enhancing crucial information, improving the network's feature extraction performance.
The C2fLEFEM module comprised three sub-modules: the low-frequency filter enhancement module (LEF), the feature enhancement module (FEM), and the C2f module. LEF applied low-frequency filtering to remove high-frequency noise, FEM fused the low-frequency enhanced features with the original features while minimizing noise, and the C2f module preserved gradient information, contributing to better feature extraction.
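The article does not reproduce the authors' code, so the exact design of these sub-modules is not shown here. The following is a minimal PyTorch sketch, assuming LEF can be approximated by a stride-1 average-pooling low-pass branch and FEM by a 1x1 convolutional fusion of the two feature streams; all names and design details below are illustrative assumptions, not the paper's implementation:

```python
import torch
import torch.nn as nn

class LEF(nn.Module):
    """Illustrative low-frequency branch: stride-1 average pooling acts as
    a simple low-pass filter that smooths away high-frequency noise."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.low_pass = nn.AvgPool2d(kernel_size, stride=1,
                                     padding=kernel_size // 2)

    def forward(self, x):
        return self.low_pass(x)

class FEM(nn.Module):
    """Illustrative fusion: concatenate original and low-frequency features,
    then project back to the original channel count with a 1x1 convolution."""
    def __init__(self, channels):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.act = nn.SiLU()

    def forward(self, x, x_low):
        return self.act(self.fuse(torch.cat([x, x_low], dim=1)))
```

In the full C2fLEFEM block, the fused output would then pass through a C2f stage; that wiring is omitted here.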
Additionally, the authors introduced the AMC2fLEFEM module to enhance feature extraction in the YOLOv5 neck. This module combined the simple attention module (SimAM) mechanism with low-level feature extraction, effectively reducing the impact of noise and enriching semantic features.
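SimAM itself is a published, parameter-free attention mechanism with a standard formulation, sketched below in PyTorch; how the authors wire it into AMC2fLEFEM alongside the low-level features is specific to the paper and not reproduced here.

```python
import torch
import torch.nn as nn

class SimAM(nn.Module):
    """Parameter-free SimAM attention: each activation is weighted by an
    energy term measuring how much it deviates from its channel mean."""
    def __init__(self, e_lambda=1e-4):
        super().__init__()
        self.e_lambda = e_lambda  # small regularizer from the SimAM paper

    def forward(self, x):
        _, _, h, w = x.size()
        n = h * w - 1
        # Squared deviation of each activation from its channel-wise mean.
        d = (x - x.mean(dim=[2, 3], keepdim=True)).pow(2)
        v = d.sum(dim=[2, 3], keepdim=True) / n
        # Distinctive activations get weights above 0.5 after the sigmoid.
        e_inv = d / (4 * (v + self.e_lambda)) + 0.5
        return x * torch.sigmoid(e_inv)
```

Because SimAM adds no learnable parameters, it can be dropped into existing blocks without increasing model size, which suits a speed-sensitive detector.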
The AMRFB module was proposed to overcome challenges in low-light conditions by enlarging the receptive field using atrous convolution and introducing the SimAM attention mechanism. This enhanced feature extraction while directing the network's attention to crucial target areas.
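A plausible reading of such a block is sketched below, assuming the receptive field is enlarged by parallel 3x3 atrous convolutions at increasing dilation rates and the fused result is reweighted by SimAM; the branch layout and dilation rates are assumptions, and `SimAM` is the class from the previous sketch:

```python
class AMRFB(nn.Module):
    """Illustrative attention-mechanism receptive field block: parallel
    dilated convolutions widen the receptive field; SimAM then reweights
    the fused features toward salient target regions."""
    def __init__(self, channels, dilations=(1, 3, 5)):
        super().__init__()
        # Each branch keeps spatial size: padding equals the dilation rate.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Conv2d(len(dilations) * channels, channels,
                              kernel_size=1)
        self.attn = SimAM()  # SimAM as defined in the sketch above

    def forward(self, x):
        feats = [branch(x) for branch in self.branches]
        return self.attn(self.fuse(torch.cat(feats, dim=1)))
```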
The SimSPPF module replaced the original SPPF module to address gradient issues in extreme low-light settings, improving model speed and stability. To enhance detection efficacy in low-light conditions, the original detection head was substituted with a decoupled head. Together, these innovations gave the NLE-YOLO model improved feature extraction, attention mechanisms, and adaptability to varying lighting conditions in low-light object detection.
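For context, SimSPPF is known from YOLOv6 as the SPPF block with ReLU activations in place of SiLU; the sketch below follows that standard structure, though the authors' exact configuration may differ:

```python
class SimSPPF(nn.Module):
    """SPPF with ReLU activations (YOLOv6-style SimSPPF): three chained
    5x5 max-pools emulate 5x5, 9x9, and 13x13 pooling at low cost."""
    def __init__(self, c_in, c_out, k=5):
        super().__init__()
        c_hid = c_in // 2
        self.cv1 = nn.Sequential(nn.Conv2d(c_in, c_hid, 1, bias=False),
                                 nn.BatchNorm2d(c_hid), nn.ReLU())
        self.pool = nn.MaxPool2d(kernel_size=k, stride=1, padding=k // 2)
        self.cv2 = nn.Sequential(nn.Conv2d(4 * c_hid, c_out, 1, bias=False),
                                 nn.BatchNorm2d(c_out), nn.ReLU())

    def forward(self, x):
        x = self.cv1(x)
        y1 = self.pool(x)
        y2 = self.pool(y1)
        # Concatenate the input with all three pooling scales, then project.
        return self.cv2(torch.cat([x, y1, y2, self.pool(y2)], dim=1))
```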
Experimental Results
The experimental results showcased the proposed NLE-YOLO's superior performance. The experiments were conducted on an Intel Xeon Platinum 8350C processor with an RTX 3090 graphics card, using Python 3.8, CUDA 11.3, and PyTorch 1.10.0. The training parameters were optimized using a dynamic approach that adjusted the optimization strategy, learning rate, momentum, and weight decay. The experiments used the ExDark dataset for low-light object detection, divided into training, validation, and testing sets.
The proposed NLE-YOLO network was compared with state-of-the-art models using extensive evaluation metrics, including precision, recall, mean average precision (mAP), and frames per second (FPS). NLE-YOLO demonstrated significant performance improvements over the other models, especially when combined with the SGZ algorithm.
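For reference, precision and recall reduce to simple detection counts at a given confidence and IoU threshold, and mAP averages the area under the precision-recall curve across classes. A minimal illustration with made-up counts:

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive,
    and false-negative detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical example: 80 correct detections, 20 spurious, 10 missed.
p, r = precision_recall(tp=80, fp=20, fn=10)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.800, recall=0.889
```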
Quantitative and qualitative evaluations, including precision-recall curves and mAP graphs, highlighted the effectiveness of the proposed technique. Ablation experiments confirmed the importance of each module in the NLE-YOLO network, showcasing performance enhancements with the inclusion of modules like C2fLEFEM, AMC2fLEFEM, AMRFB, decoupled head, and SimSPPF.
Conclusion
In conclusion, the proposed NLE-YOLO model effectively addressed the challenges of low-light object detection and of noise interference during enhancement. The model leveraged enhanced images as input and introduced novel modules such as C2fLEFEM and AMRFB to strengthen feature extraction and attention. The proposed model offered a promising solution for robust low-light object detection.