In a paper published in the journal Buildings, researchers proposed a three-step computer vision (CV)-based framework for efficient crack detection and quantification in concrete structures.
The method used You Only Look Once version 8 (YOLOv8) for crack localization; Gaussian filtering, Canny edge detection, and the findContours algorithm for scale conversion; and the ApproxPolyDP function, which is based on the Douglas-Peucker polygon approximation algorithm, together with the Hough transform for crack dimension quantification. The framework, validated on a large dataset, demonstrated high detection accuracy and precision in crack measurements. However, it faced challenges with smaller cracks, complex backgrounds, and the need for pre-marked reference frames, which limited its applicability.
Background
Past work has explored various methods for concrete crack detection, from traditional image processing techniques to advanced machine learning (ML) and deep learning (DL) approaches. Early methods, such as edge detection and thresholding, were prone to environmental noise, while machine learning methods required extensive preprocessing.
Convolutional neural networks (CNNs) and YOLO-based models significantly improved detection speed and accuracy, but challenges remained in quantifying crack dimensions in real-world units. Reference-based methods offered more accurate measurements but introduced operational complexity and vulnerability to environmental factors.
Enhanced Crack Measurement
The YOLO series of object detection algorithms, particularly the recent YOLOv8, has gained significant attention for its efficient and accurate performance. This study used YOLOv8 to locate the bounding boxes of concrete cracks in images of damaged structures. According to the study, the backbone of the model uses Darknet-53 to extract feature maps through a series of convolutional and residual blocks, while a feature pyramid network (FPN) and a path aggregation network (PAN) enhance semantic and localization features.
To quantify crack dimensions in millimeters, the team introduced a reference frame with known dimensions (25 × 50 mm). The process involves stamping the reference seal next to the crack, converting the image to grayscale, and applying Gaussian filtering and Canny edge detection to extract the reference frame's contour.
The ApproxPolyDP function is then used to fit a polygon to the contour, simplifying it by removing redundant points. The perimeter and area of the reference frame are calculated in pixels and compared with its known physical dimensions, which yields the scales for converting pixel measurements to real-world units. After YOLOv8 recognizes the crack, the image is cropped to the bounding box to focus on the crack. The findContours algorithm and ApproxPolyDP function are applied again to extract and fit the crack's contour.
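OpenCV's approxPolyDP function, which the name ApproxPolyDP appears to refer to, is based on the Douglas-Peucker algorithm. As a rough illustration of how this kind of simplification removes redundant contour points, the sketch below implements the algorithm in plain Python for an open polyline. The function names, the sample points, and the tolerance `epsilon` are assumptions chosen for illustration and do not come from the paper.

```python
import math


def point_segment_distance(p, a, b):
    """Return the distance from point p to the line segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # Degenerate segment: a and b coincide.
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, epsilon):
    """Simplify an open polyline with the Douglas-Peucker algorithm.

    Points closer than epsilon to the chord between the end points
    are treated as redundant and dropped.
    """
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > max_dist:
            max_dist, index = d, i
    if max_dist <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = douglas_peucker(points[: index + 1], epsilon)
    right = douglas_peucker(points[index:], epsilon)
    return left + right[1:]


# Hypothetical noisy contour tracing two edges of a rectangular frame.
noisy = [(0, 0), (1, 0.1), (2, -0.1), (3, 0), (3, 1), (3.1, 2), (3, 3)]
simplified = douglas_peucker(noisy, epsilon=0.5)
print(simplified)
```

With this tolerance, the small wobbles along each edge are discarded and only the corner points remain. A larger `epsilon` gives a coarser polygon, so in practice the tolerance would need tuning to the image resolution.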
The crack's perimeter and area in pixels are calculated and, using the previously determined conversion scales, converted to a length in millimeters and an area in square millimeters. The maximum width of the crack is determined using a combination of Canny edge detection and the Hough transform, which identifies the largest inscribed circle within the crack contour.
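The conversion from pixels to millimeters can be sketched as follows, assuming the reference frame and the crack have already been reduced to polygon vertices in pixel coordinates. Only the 25 × 50 mm frame size comes from the article. The helper names, the pixel coordinates, and the choice to derive one scale from the perimeter and another from the area are illustrative assumptions, not the authors' exact procedure.

```python
import math

# Physical size of the reference frame, as stated in the article.
REF_WIDTH_MM, REF_HEIGHT_MM = 25.0, 50.0


def polygon_perimeter(pts):
    """Perimeter of a closed polygon given as a list of (x, y) vertices."""
    n = len(pts)
    return sum(math.dist(pts[i], pts[(i + 1) % n]) for i in range(n))


def polygon_area(pts):
    """Area of a closed polygon using the shoelace formula."""
    n = len(pts)
    total = 0.0
    for i in range(n):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def conversion_scales(ref_contour_px):
    """Return (mm per pixel, mm^2 per pixel^2) from the reference contour."""
    ref_perimeter_mm = 2.0 * (REF_WIDTH_MM + REF_HEIGHT_MM)
    ref_area_mm2 = REF_WIDTH_MM * REF_HEIGHT_MM
    length_scale = ref_perimeter_mm / polygon_perimeter(ref_contour_px)
    area_scale = ref_area_mm2 / polygon_area(ref_contour_px)
    return length_scale, area_scale


# Hypothetical reference frame that appears as 100 x 200 pixels in the image.
ref_px = [(0, 0), (100, 0), (100, 200), (0, 200)]
length_scale, area_scale = conversion_scales(ref_px)

# Hypothetical crack polygon, 400 pixels long and 8 pixels wide.
crack_px = [(0, 0), (400, 0), (400, 8), (0, 8)]
crack_perimeter_mm = polygon_perimeter(crack_px) * length_scale
crack_area_mm2 = polygon_area(crack_px) * area_scale

print(length_scale, area_scale)
print(crack_perimeter_mm, crack_area_mm2)
```

Because the scales are computed from a frame stamped next to the crack, they hold only where the frame and the crack lie in roughly the same plane and at a similar distance from the camera.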
The team applied these methodologies to analyze crack images and quantify their dimensions accurately. By combining DL for crack detection with reference-based methods for dimension quantification, the study aimed to improve the precision of crack measurements in concrete structures, offering a robust approach to structural health monitoring.
Concrete Crack Analysis
The study used YOLOv8, with the Darknet-53 architecture for feature extraction, to identify cracks in images of damaged concrete. The model was trained on a dataset combining images from infrastructure evaluations with a publicly available crack dataset.
The training setup used Python, the CUDA Toolkit, cuDNN, and PyTorch on a computer with a Core i9 central processing unit (CPU).
The model was trained with the stochastic gradient descent (SGD) optimizer, a momentum of 0.937, and a learning rate of 0.01, and it converged in approximately 4 hours over 150 epochs. Non-maximum suppression (NMS) was applied to refine the detection results by eliminating redundant bounding boxes, so that each detected crack was represented by a single box.
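To show what NMS does with overlapping detections, here is a minimal greedy version in plain Python. It is a generic sketch, not the implementation used inside YOLOv8. The box coordinates, the confidence scores, and the intersection-over-union (IoU) threshold of 0.5 are assumptions made for the example.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of the boxes to keep, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two overlapping boxes on one crack, one elsewhere.
boxes = [(10, 10, 110, 60), (12, 12, 112, 62), (200, 200, 260, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

Here the second box overlaps the first almost entirely and has a lower score, so it is suppressed, while the distant third box is kept.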
For crack dimension quantification, the study used a reference frame with known dimensions to convert pixel measurements to real-world dimensions. The process involved stamping a seal near the crack, converting the image to binary, and applying Gaussian filtering and Canny edge detection. The crack contours were extracted using the findContours algorithm, and the resulting measurements were then compared with actual dimensions obtained using a crack width meter and a vernier caliper.
Results showed that the proposed method provided accurate crack size conversion, with length and width measurement errors within 7%. Despite the method's effectiveness, limitations include the necessity of a pre-stamped reference frame and sensitivity to background complexity, which affects the robustness of the Gaussian filtering and Canny edge detection algorithms. Further improvements are needed to enhance the accuracy of engineering inspections.
Conclusion
To sum up, the researchers developed an efficient three-step CV framework for detecting and measuring concrete cracks using a smartphone. YOLOv8 achieved 95.7% accuracy in identifying cracks, while image processing techniques determined crack dimensions in millimeters using a reference frame with a quick response (QR) code.
Despite its effectiveness, the method is limited by its need for pre-stamped reference frames and for manual adjustments when backgrounds are complex. Future work aims to enhance accuracy, expand data diversity, and explore automated seal stamping for improved field inspections.
Journal reference:
- Qi, Y., et al. (2024). A Three-Step Computer Vision-Based Framework for Concrete Crack Detection and Dimensions Identification. Buildings, 14(8), 2360. DOI: 10.3390/buildings14082360, https://www.mdpi.com/2075-5309/14/8/2360