In an article published in the journal Nature, researchers explored the application of convolutional neural networks (CNNs) in recognizing Shen embroidery, aiming to safeguard intangible cultural heritage.
By preprocessing a dataset of Shen embroidery and employing transfer learning with MobileNet V1 enhanced by spatial pyramid pooling (SPP), the authors achieved a recognition accuracy of 98.45%. The improved CNN provided crucial technical support for the intelligent preservation and promotion of Shen embroidery, a significant cultural art form.
Background
The rich heritage of Shen embroidery, originating in Nantong, China, faces threats from modern industrialization, necessitating innovative preservation methods. Traditional manual approaches to identifying and classifying Shen embroidery are labor-intensive and resource-heavy, prompting the exploration of computer vision (CV) technology as a solution.
While previous studies have leveraged CV techniques and deep learning (DL) for cultural heritage preservation, they often suffer from limited datasets, manual annotations, and varying accuracy due to annotator subjectivity. This paper aimed to bridge these gaps by introducing CNNs to automate Shen embroidery recognition. By enhancing and expanding the dataset through image processing techniques, the authors addressed the scarcity of Shen embroidery data. They experimented with five image classification networks and employed transfer learning to enhance classification accuracy. Additionally, replacing the average pooling layer with SPP further refined the classification network, improving recognition performance.
The proposed methodology not only advanced the application of artificial intelligence (AI) in cultural heritage preservation but also provided valuable insights for Shen embroidery research and inheritance. By automating the recognition process, the researchers streamlined the identification and protection of Shen embroidery, contributing to the safeguarding of intangible cultural heritage for future generations.
Experimental Setup and Techniques
The authors utilized a dataset comprising 1264 Shen embroidery images sourced from the Shen Embroidery Museum and web scraping, later augmented through techniques like flipping and rotating. With each image augmented approximately 15 times, the dataset totaled 18,960 images, divided into training and validation sets in a 9:1 ratio, alongside a separate test set.
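The paper does not spell out the exact augmentation pipeline, but a minimal sketch of offline augmentation by flipping and rotating each image, followed by a 9:1 train/validation split, could look like the following. The rotation angles, file paths, and helper names are illustrative assumptions, not the authors' reported settings.

```python
# Hypothetical sketch: generate flipped and rotated variants of each image,
# then split image paths 9:1 into training and validation sets.
import random
from pathlib import Path
from PIL import Image

def augment_image(img: Image.Image) -> list[Image.Image]:
    """Return flipped and rotated variants of a single embroidery image."""
    variants = [
        img.transpose(Image.FLIP_LEFT_RIGHT),   # horizontal flip
        img.transpose(Image.FLIP_TOP_BOTTOM),   # vertical flip
    ]
    # Rotations at assumed angles; the authors' exact angles are not specified.
    for angle in (30, 45, 60, 90, 120, 135, 150, 180, 210, 240, 270, 300, 330):
        variants.append(img.rotate(angle, expand=True))
    return variants  # 15 augmented versions per original image

def split_dataset(paths: list[Path], ratio: float = 0.9) -> tuple[list[Path], list[Path]]:
    """Shuffle image paths and split them 9:1 into training and validation sets."""
    random.shuffle(paths)
    cut = int(len(paths) * ratio)
    return paths[:cut], paths[cut:]
```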
The experiments were conducted on a Windows 10 platform using Python under Anaconda, with the PyTorch framework. The experimental setup featured a GeForce GTX 1070 Ti graphics processing unit (GPU) with eight gigabytes (GB) of memory and an AMD Ryzen 5 1600X six-core processor. MobileNet V1, chosen for its lightweight architecture and fast computation, served as the neural network for Shen embroidery recognition.
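MobileNet V1's efficiency stems from depthwise separable convolutions, which factor a standard convolution into a per-channel depthwise filter followed by a 1x1 pointwise mix. A minimal PyTorch sketch of this building block is shown below; the channel sizes and input shape are illustrative, not the authors' exact configuration.

```python
# Minimal sketch of MobileNet V1's core building block in PyTorch.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    """Depthwise 3x3 convolution followed by a pointwise 1x1 convolution."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.block = nn.Sequential(
            # Depthwise: one filter per input channel (groups=in_ch)
            nn.Conv2d(in_ch, in_ch, kernel_size=3, stride=stride,
                      padding=1, groups=in_ch, bias=False),
            nn.BatchNorm2d(in_ch),
            nn.ReLU(inplace=True),
            # Pointwise: 1x1 convolution to mix information across channels
            nn.Conv2d(in_ch, out_ch, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.block(x)

# Example: a 32-channel feature map downsampled to 64 channels
x = torch.randn(1, 32, 112, 112)
y = DepthwiseSeparableConv(32, 64, stride=2)(x)  # -> (1, 64, 56, 56)
```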
Transfer learning was employed owing to the limited Shen embroidery data, leveraging pre-trained models such as AlexNet and residual network (ResNet) to mitigate overfitting and reduce the number of trainable parameters. SPP was integrated into the recognition network to enhance feature fusion and improve detection accuracy, especially for images with objects of varying sizes. The evaluation metric was accuracy, calculated from the number of correct identifications of "shenxiu" (Shen embroidery) and "fei" (non-Shen embroidery) images in the dataset.
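SPP pools the final feature map over several grid sizes and concatenates the results into a fixed-length vector, which is one way to replace the average pooling layer ahead of the classifier. The sketch below is a hedged illustration; the pyramid levels (1, 2, 4) and the feature-map shape are assumptions, not the paper's reported settings.

```python
# Sketch of a spatial pyramid pooling (SPP) head before the classifier.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialPyramidPooling(nn.Module):
    """Pool a feature map at multiple grid sizes and concatenate the results."""
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = []
        for level in self.levels:
            pooled = F.adaptive_max_pool2d(x, output_size=level)  # (N, C, level, level)
            features.append(pooled.flatten(start_dim=1))
        return torch.cat(features, dim=1)  # fixed-length vector regardless of input size

# For an assumed 1024-channel feature map, the SPP output has
# 1024 * (1 + 4 + 16) = 21,504 features feeding the final classifier.
spp = SpatialPyramidPooling()
out = spp(torch.randn(1, 1024, 7, 7))  # -> (1, 21504)
```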
Training Progress and Model Enhancement
The authors presented the training results of the MobileNet V1 model for recognizing "shenxiu" images. The model was trained for 200 epochs, with checkpoints saved every five epochs to track progress. Training progress was visualized with the training epoch on the x-axis and the loss value on the y-axis.
Model convergence was judged from the stabilization of the loss value. Both the training and validation loss curves fluctuated early in training and gradually stabilized around 0.1 after approximately 100 epochs, although noticeable fluctuations persisted during the convergence process.
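A skeleton of a training loop matching the reported schedule (200 epochs, checkpoints every five epochs) is sketched below; the optimizer, learning rate, and data loaders are assumptions rather than the authors' reported settings.

```python
# Illustrative training-loop skeleton with periodic checkpointing.
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, device, epochs=200, ckpt_every=5):
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)  # assumed
    model.to(device)
    for epoch in range(1, epochs + 1):
        model.train()
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        # Track validation loss to watch for the convergence described above
        model.eval()
        val_loss, batches = 0.0, 0
        with torch.no_grad():
            for images, labels in val_loader:
                images, labels = images.to(device), labels.to(device)
                val_loss += criterion(model(images), labels).item()
                batches += 1
        print(f"epoch {epoch}: val_loss={val_loss / max(batches, 1):.4f}")
        if epoch % ckpt_every == 0:
            torch.save(model.state_dict(), f"checkpoint_epoch_{epoch}.pth")
```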
Subsequently, the researchers enhanced the MobileNet V1 model and retrained it on the same dataset under the same conditions: 200 epochs with checkpoints saved every five epochs. The training results of the enhanced model showed the loss curves stabilizing around 0.12 after approximately 100 epochs, indicating successful fitting and convergence. The comparison between the original and improved MobileNet V1 models provided insight into how effectively the enhancements optimized the model's performance for Shen embroidery recognition.
Analyzing Model Performance and Enhancements
Various classification models were compared, including AlexNet, Visual Geometry Group (VGG)-16, ResNet50, MobileNet V1, and Inception V3. MobileNet V1 exhibited the fastest convergence and highest performance during training. Confusion matrices revealed MobileNet V1's superior ability to distinguish between "shenxiu" and non-"shenxiu" images. Transfer learning significantly improved recognition accuracy, with the model achieving 97.86% accuracy, 1.11% higher than without transfer learning.
Furthermore, an improved MobileNet V1 model reached 98.45% accuracy after fine-tuning, showing exceptional performance in recognizing both "shenxiu" and non-"shenxiu" images. These findings suggested that MobileNet V1, especially when enhanced through transfer learning, was a promising model for Shen embroidery recognition, offering practical applications in preserving and promoting this intangible cultural heritage.
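In this binary setting, accuracy follows directly from the confusion-matrix counts of "shenxiu" and non-"shenxiu" predictions. The sketch below uses illustrative counts, not the paper's results.

```python
# Simple sketch: recognition accuracy from binary confusion-matrix counts.
def accuracy_from_confusion(tp: int, tn: int, fp: int, fn: int) -> float:
    """Accuracy = correct predictions / all predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Hypothetical example: 980 correct out of 1,000 predictions gives 98% accuracy.
print(f"{accuracy_from_confusion(tp=500, tn=480, fp=10, fn=10):.4f}")  # 0.9800
```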
Conclusion
In conclusion, leveraging CNNs, particularly the enhanced MobileNet V1, proved instrumental in recognizing Shen embroidery, showcasing a remarkable accuracy of 98.45%. This innovative approach addressed the challenges posed by traditional manual methods and limited datasets, offering a promising solution for safeguarding and promoting intangible cultural heritage.
By automating recognition processes and refining model performance through techniques like transfer learning and SPP, the authors contributed significantly to the intelligent preservation and advancement of Shen embroidery, thus ensuring its legacy for future generations.