In an article published in the Decision Analytics Journal, researchers presented a deep learning model based on the You Only Look Once (YOLO) algorithm to assist visually impaired individuals in detecting potholes from real-time camera data.
The model, integrated into an application, provided auditory or haptic feedback, enabling safer navigation. Achieving 82.7% accuracy and 30 frames per second (FPS) in live video, the model enhanced mobility and safety for visually impaired users by detecting nearby potholes.
Background
Object recognition systems have evolved significantly, driven by the need for high-speed and precise identification in various applications. The YOLO algorithm is highly effective at real-time object detection and is widely used in applications such as autonomous driving and security systems.
Yet visually impaired individuals face significant obstacles, particularly in identifying potholes, which are hazards that are difficult to spot because of their unpredictable shapes and sizes. Traditional methods such as edge detection and template matching have proven largely ineffective for this problem.
Research has shown that deep learning models, particularly those based on YOLO, outperform traditional machine learning methods in detection accuracy. However, despite their advanced capabilities, many of these systems are either not easily portable or rely on internet access, which limits their practicality for individuals with visual impairments.
This paper addressed these gaps by proposing a YOLOv5-based pothole detection system designed for real-time use on mobile devices, providing auditory or haptic feedback to enhance safety and independence for visually impaired travelers.
Methodology for Accessible Pothole Detection
The researchers introduced a mobile application designed to detect potholes on roads, specifically tailored for visually impaired users. This app leveraged the YOLOv5 algorithm for real-time object detection, integrating with Google Text-to-Speech (gTTS) to alert users of nearby potholes.
The application was unique in that it operated without a graphical user interface (GUI) and required no user prompts; instead, it automatically began recording video, guided by voice assistance, as soon as the app was activated. When a pothole was detected, the user received an audible warning, enhancing road safety.
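As a rough illustration of such a detect-and-alert pipeline (a sketch rather than the authors' implementation), the snippet below assumes the public YOLOv5 PyTorch Hub interface, OpenCV for frame capture, and the gTTS library; the weights file name and the "pothole" class label are hypothetical.

```python
# Sketch of a detect-and-alert loop: YOLOv5 via PyTorch Hub, OpenCV for
# camera frames, gTTS for spoken warnings. Weights path and class name
# are illustrative, not taken from the paper.
import cv2
import torch
from gtts import gTTS

# Load a custom-trained YOLOv5 model (hypothetical weights file).
model = torch.hub.load("ultralytics/yolov5", "custom", path="pothole_best.pt")

def speak(text, filename="alert.mp3"):
    """Convert a warning message to speech and save it for playback."""
    gTTS(text=text, lang="en").save(filename)
    # Playback would be handled by the phone's audio layer in the real app.

cap = cv2.VideoCapture(0)  # default camera
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame)                 # run detection on the frame
    detections = results.pandas().xyxy[0]  # bounding boxes as a DataFrame
    if (detections["name"] == "pothole").any():
        speak("Warning: pothole ahead")
cap.release()
```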
YOLOv5 was chosen for its efficiency and ability to process custom datasets quickly. It comprised three key components: the backbone, neck, and head. The backbone, based on Cross Stage Partial Darknet53 (CSPDarknet53), extracted image features. The neck, utilizing a path aggregation network (PANet), enhanced feature representation.
Finally, the head generated the final predictions, including bounding box coordinates and class probabilities. Key processes included data augmentation, model training, and post-processing with techniques like non-maximum suppression to ensure accurate and robust pothole detection. The application did not require user registration and functioned effectively with standard mobile cameras, offering an accessible solution for road safety.
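To illustrate the non-maximum suppression step mentioned above, the sketch below uses torchvision's built-in implementation; the confidence and IoU thresholds are illustrative defaults rather than values reported in the paper.

```python
# Illustration of non-maximum suppression (NMS) post-processing using
# torchvision. Score and IoU thresholds are illustrative values only.
import torch
from torchvision.ops import nms

def postprocess(boxes, scores, score_thresh=0.25, iou_thresh=0.45):
    """Drop low-confidence boxes, then suppress overlapping duplicates.

    boxes:  (N, 4) tensor in (x1, y1, x2, y2) format
    scores: (N,) tensor of detection confidence
    """
    keep = scores > score_thresh          # filter low-confidence predictions
    boxes, scores = boxes[keep], scores[keep]
    idx = nms(boxes, scores, iou_thresh)  # suppress heavily overlapping boxes
    return boxes[idx], scores[idx]

# Example: two heavily overlapping pothole boxes collapse to one.
boxes = torch.tensor([[100., 100., 200., 200.],
                      [105., 102., 205., 198.],
                      [400., 300., 460., 360.]])
scores = torch.tensor([0.90, 0.75, 0.60])
print(postprocess(boxes, scores))
```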
Results and Analysis
The experimental analysis focused on implementing the YOLOv5 model for pothole detection. The model was initially pre-trained on the Common Objects in Context (COCO) dataset and later fine-tuned on a specialized dataset containing 9,240 images of potholes under various conditions. The dataset was divided into three subsets: 6,091 images for training, 2,094 for validation, and 1,055 for testing. The training process used text-format (.txt) annotation labels to help the model learn to detect and localize potholes accurately.
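In the standard YOLO annotation scheme, each .txt label line holds a class index and a normalized bounding box (center coordinates, width, and height). The short sketch below shows how such a line maps back to pixel coordinates; the sample values are illustrative, not taken from the paper's dataset.

```python
# Sketch of how a YOLO-format .txt label line maps to pixel coordinates.
# Each line stores: class_id x_center y_center width height, all normalized
# to [0, 1]. The sample line and image size below are illustrative.
def yolo_label_to_pixels(line, img_w, img_h):
    cls, xc, yc, w, h = line.split()
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    x1, y1 = xc - w / 2, yc - h / 2  # top-left corner
    x2, y2 = xc + w / 2, yc + h / 2  # bottom-right corner
    return int(cls), (x1, y1, x2, y2)

# Example label line for a pothole (class 0) in a 1280x720 frame.
print(yolo_label_to_pixels("0 0.5 0.6 0.2 0.15", 1280, 720))
```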
Data preprocessing and augmentation were crucial to improving the model's performance. To increase robustness under varying lighting and orientation conditions, techniques such as auto-orientation, contrast adjustment, image flipping, rotation, and saturation adjustment were applied.
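The article does not name the augmentation tooling, so the sketch below shows comparable transforms using the albumentations library purely as an illustration; the parameters are placeholder values, and YOLOv5 can also apply similar augmentations internally.

```python
# Sketch of comparable augmentations with the albumentations library.
# Parameters are illustrative, not the paper's settings.
import albumentations as A

augment = A.Compose(
    [
        A.HorizontalFlip(p=0.5),            # image flipping
        A.Rotate(limit=15, p=0.5),          # small rotations
        A.RandomBrightnessContrast(p=0.5),  # lighting/contrast changes
        A.HueSaturationValue(p=0.5),        # saturation adjustment
    ],
    # Keep bounding boxes consistent with the transformed image.
    bbox_params=A.BboxParams(format="yolo", label_fields=["class_labels"]),
)

# Usage: augment(image=image, bboxes=bboxes, class_labels=labels)
# returns the transformed image with correspondingly adjusted boxes.
```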
The YOLOv5 model was trained on Google Colab, employing the Adam optimizer along with a blend of loss functions, including binary cross-entropy, focal loss, and generalized intersection over union (GIoU) loss. The trained model was assessed using precision, recall, and mean average precision (mAP) metrics, resulting in a precision of 86.2%, a recall of 75.9%, and an mAP at 0.5 IoU of 82.5%. When tested on live video, the model detected potholes at a rate of 30 FPS with a resolution of 1280 × 720.
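As a worked illustration of one of these loss components, the function below computes the generalized IoU term in its standard form (the training loss uses 1 - GIoU); it is a sketch of the definition rather than the paper's implementation.

```python
# Sketch of the generalized IoU (GIoU) term used in box-regression losses.
# Boxes are (x1, y1, x2, y2); the loss applied during training is 1 - GIoU.
def giou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection and union of the two boxes.
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = area_a + area_b - inter
    iou = inter / union

    # Smallest box enclosing both; penalizes distant, non-overlapping boxes.
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    c_area = (cx2 - cx1) * (cy2 - cy1)

    return iou - (c_area - union) / c_area

# Example: partially overlapping prediction and ground-truth boxes.
print(giou((0, 0, 100, 100), (50, 50, 150, 150)))  # GIoU in (-1, 1]
# Corresponding loss contribution: 1 - giou(...)
```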
Conclusion
The researchers successfully developed a pothole detection model using the YOLO algorithm, achieving 82.7% accuracy and 30 FPS on live video. Integrated into a mobile application, the system provided auditory or haptic feedback to visually impaired users, enhancing their road safety.
While the current model was limited to detecting potholes, future improvements aim to expand its capabilities to identify additional hazards and enhance its performance across diverse road conditions, ensuring greater safety and independence for visually impaired individuals. Further development will also focus on optimizing the model for mobile devices and improving the detection range.
Journal reference:
- Paramarthalingam, A., Sivaraman, J., Theerthagiri, P., Vijayakumar, B., & Baskaran, V. (2024). A deep learning model to assist visually impaired in pothole detection using computer vision. Decision Analytics Journal, 100507. DOI: 10.1016/j.dajour.2024.100507. https://www.sciencedirect.com/science/article/pii/S2772662224001115