Poor generalization and unclear edge detection often afflict fabric defect detection (FDD) methods in the textile industry. Researchers recently tackled these challenges in the journal Engineering Applications of Artificial Intelligence by introducing a novel network named ‘U-SMR Net.’ The network integrates global context, defect details, and high-level semantics using residual network (ResNet)-50 and Swin Transformer modules.
Background
FDD is pivotal to maintaining textile product quality during production. Traditional methods group defects into stain-like and broken-like categories, and complex defects often challenge them; broken-like defects in particular remain difficult to detect given the complexity of modern textiles. Computer vision-based methods have been developed but may falter in complex scenes. Various approaches exist, each with strengths and limitations, including a multistage generative adversarial network (GAN) framework, the genetic algorithm Gabor Faster Region-based convolutional neural network (R-CNN), and Mobile-Unet.
Previous Work
For target detection, deep convolutional neural network models often incorporate encoder-decoder (codec) structures for multi-scale feature extraction. Fabric defect detection algorithms frequently adopt parameters pre-trained on networks such as visual geometry group (VGG)-16, ResNet-50, and DenseNet-169 as backbones. Previous studies employed different methods for FDD, including the atrous spatial pyramid pooling (ASPP) module, the Swin Transformer, a mix of ResNet and Swin Transformer, and various convolutional architectures. These methods have limitations: they may lose fine details during feature combination and struggle with local information at different scales. To address these issues, the current study presents the U-SMR network model, which aims to improve feature extraction in the encoding stage and reduce information loss during multi-scale sampling.
The U-SMR Network Architecture
The U-SMR model combines the Swin Transformer and ResNet-50 as backbone branches, using Swin Transformer blocks (STBlocks) for global feature extraction and multi-scale residual blocks (RBlocks) for local feature modeling. To effectively harness global context information during downsampling, residual and Swin Transformer (RST) modules are introduced.
Each RST module is divided into two submodules, RBlocks and STBlocks, enabling high-scale feature information to be preserved and passed to the lower-scale depth modules. To improve model efficiency and manage memory, variable overlays are used in the code implementation, resulting in a staggered arrangement of RBlocks and STBlocks.
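The pairing of a convolutional local branch with an attention-based global branch can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the module name, channel counts, and the use of plain multi-head self-attention as a stand-in for Swin window attention are all illustrative assumptions.

```python
import torch
import torch.nn as nn

class RSTBlock(nn.Module):
    """Illustrative two-branch block: a residual conv branch (local detail,
    RBlock stand-in) and a self-attention branch (global context, STBlock
    stand-in), fused by a 1x1 conv. A simplification of the RST idea, not
    the published architecture."""
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        # Local branch: a standard residual conv sub-block
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        # Global branch: plain self-attention over flattened pixel tokens
        self.attn = nn.MultiheadAttention(channels, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)
        # Fuse both branches back to the input channel count
        self.fuse = nn.Conv2d(2 * channels, channels, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        local = torch.relu(self.local(x) + x)           # residual local features
        tokens = x.flatten(2).transpose(1, 2)           # (B, H*W, C)
        attn_out, _ = self.attn(tokens, tokens, tokens)
        global_feat = self.norm(attn_out + tokens)      # residual global features
        global_feat = global_feat.transpose(1, 2).reshape(b, c, h, w)
        return self.fuse(torch.cat([local, global_feat], dim=1))
```

Stacking such blocks at successive downsampling scales would yield the kind of staggered local/global arrangement the article describes.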
Moreover, a dual-branch encoder (DBE) block is introduced to enable two-branch, multi-scale, high-resolution parsing of input images, preserving local details. In the decoding stage, a recursive multi-level residual (RMR) module filters, refines, and enhances the input features. This three-tiered architecture combines multi-scale features from six outputs to generate predictive probability maps. To bolster prediction confidence, the study proposes a weighted variant of the binary cross-entropy loss (WBCE) for deep supervision.
The model's structure ensures accurate fabric defect detection while reducing edge detection ambiguity. It balances local and global features for improved accuracy, making it a robust solution for textile inspection tasks.
Experiments and Analysis
The data used in this study were sourced from the Zhejiang University Leaper (ZJU-Leaper) public dataset, which served as the benchmark for FDD. The dataset organizes fabric defect samples into four distinct groups (G1–G4), each presenting a different level of detection difficulty and a unique background texture.
The proposed model was evaluated using several key metrics: true positive rate (TPR), false positive rate (FPR), positive predictive value (PPV), negative predictive value (NPV), and F-measure. The F-measure combines TPR and PPV to assess precision on minor defects. Three supplementary measures, mean absolute error (MAE), the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve, accounted for non-confident predictions.
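These metrics all follow from the pixel-wise confusion matrix of a binarized prediction against the ground-truth mask. The sketch below uses the standard textbook definitions; the threshold and averaging conventions of the paper itself are not specified in the article.

```python
import numpy as np

def _safe_div(num: float, den: float) -> float:
    """Return num/den, or 0.0 when the denominator is zero."""
    return float(num) / float(den) if den else 0.0

def binary_metrics(pred: np.ndarray, target: np.ndarray) -> dict:
    """Standard pixel-wise detection metrics from a binarized prediction
    map and its ground-truth mask (1 = defect, 0 = background)."""
    tp = int(np.sum((pred == 1) & (target == 1)))
    fp = int(np.sum((pred == 1) & (target == 0)))
    tn = int(np.sum((pred == 0) & (target == 0)))
    fn = int(np.sum((pred == 0) & (target == 1)))
    tpr = _safe_div(tp, tp + fn)            # sensitivity / recall
    fpr = _safe_div(fp, fp + tn)
    ppv = _safe_div(tp, tp + fp)            # precision
    npv = _safe_div(tn, tn + fn)
    f_measure = _safe_div(2 * ppv * tpr, ppv + tpr)
    mae = float(np.mean(np.abs(pred.astype(float) - target.astype(float))))
    return {"TPR": tpr, "FPR": fpr, "PPV": ppv, "NPV": npv,
            "F-measure": f_measure, "MAE": mae}
```

On a toy prediction of `[1, 1, 0, 0]` against ground truth `[1, 0, 0, 0]`, this yields TPR 1.0 and PPV 0.5, illustrating why the F-measure, which balances the two, is the headline metric for minor defects.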
To expedite computation, the ZJU dataset was downsampled and then divided into training and test sets, with the training data augmented via random rotation. The evaluation scheme trained a separate model on each of G1–G4 (Group) and additionally on a combined training set (Total); each method was then trained and evaluated against all four test sets.
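A key detail of augmentation in segmentation tasks is that the image and its defect mask must receive the identical transform. The minimal sketch below assumes 90-degree rotations and horizontal flips for illustration; the article only states that random rotation was used, so the exact augmentation set is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(image: np.ndarray, mask: np.ndarray):
    """Apply the same random rotation/flip to an image and its defect mask.

    The specific transforms (90-degree rotations, horizontal flips) are
    assumed for illustration; the paper only mentions random rotation.
    """
    k = int(rng.integers(0, 4))            # number of 90-degree rotations
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:                 # random horizontal flip
        image, mask = np.fliplr(image), np.fliplr(mask)
    return image.copy(), mask.copy()       # copy to drop negative strides
```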
The Group training model (ZJU(Group)) consistently outperformed the Total training model (ZJU(Total)). U-SMR achieved excellent results in F-measure and NPV on the Group 2–4 datasets. The model also exhibited smooth PR and ROC curves, suggesting robust binary predictions.
The qualitative evaluation visually compared the U-SMR method with other approaches. U-SMR demonstrated higher prediction confidence and more precise edge detection, filtering background texture features effectively and generating saliency prediction maps with more detailed features. U-SMR achieved a generalization score of 75.33 percent, demonstrating competitive performance while remaining lightweight.
The final experiment extended the evaluation to a fabric sample repository, the Industrial Automation Laboratory Fabric dataset (HKU), where the model achieved competitive performance against various complex fabric textures. Detailed evaluation metrics and visualization results highlight its capability to accurately identify defects and its generalization ability.
Conclusion
In summary, a new and precise network, U-SMR, was introduced to detect fabric defects, employing a hybrid ResNet and Swin Transformer backbone. It featured the Dual-Branch Pyramid module, which enhances perceptual field distribution, along with the DBE and RMR modules for improved performance. The proposed WBCE loss ensured convergence, achieving top results on the ZJU datasets, and ablation experiments validated the model's efficacy. However, the method may overlook small textile defects, warranting future accuracy enhancements.