In a recent article published in the journal Scientific Reports, researchers introduced a new geometric method for compressing convolutional neural networks (CNNs) designed for classification tasks. Their goal was to speed up computation and improve generalization by removing non-informative components.
Background
CNNs are powerful tools for complex decision-making tasks involving high-dimensional data and large datasets. They achieve impressive results across many applications but demand substantial computational power and memory. High-performance graphics processing units (GPUs) are often needed to train and tune the millions, and in some cases billions, of parameters in these networks.
Running inference with, or "recalling", such large networks can be time-consuming even on advanced hardware. These challenges become more acute when deploying CNNs in real-world scenarios, particularly those requiring real-time processing, such as smart wearables and virtual or mixed-reality devices.
To address these issues, several techniques have been explored to optimize and accelerate CNNs. These fall into two broad groups: methods applied to pre-trained models to speed up the recall process, and approaches that compact the model's structure and reduce the number of parameters before training from scratch. Examples include reducing the numerical precision of parameters, applying quantization together with Huffman coding, and representing weights in a binary scheme.
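To give a concrete feel for the precision-reduction idea, the following sketch applies simple symmetric 8-bit quantization to a weight array with NumPy. It is a generic illustration, not the scheme used by any specific paper cited here; the function names and the scale convention are assumptions.

```python
import numpy as np

def quantize_uniform(w, num_bits=8):
    """Uniformly quantize a float weight array to signed integers.

    Returns the integer codes plus the scale needed to dequantize.
    This is a generic symmetric scheme shown for illustration only.
    """
    qmax = 2 ** (num_bits - 1) - 1                  # e.g. 127 for 8 bits
    scale = np.abs(w).max() / qmax                  # map the largest weight to qmax
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0, 0.05, size=(64, 64)).astype(np.float32)
q, scale = quantize_uniform(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())    # small relative to the weight range
```

Storing the int8 codes instead of float32 weights cuts memory roughly fourfold; entropy coding such as Huffman coding can then shrink the codes further.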
About the Research
In this paper, the authors proposed a novel compression method based on a geometric parameter called the Separation Index (SI). The SI evaluates the significance and redundancy of different elements within the CNN, such as filters, neurons, or entire layers. The proposed algorithm first removes the "ineffective" convolutional layers that do not significantly improve the SI and, consequently, the classification performance. It then selects the most informative filters from the remaining layers based on their SI values.
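The paper's exact SI formulation is not reproduced here, but a common way to define a separation index, the fraction of samples whose nearest neighbor in feature space shares their label, conveys the intuition. The sketch below is a hypothetical NumPy illustration of scoring filters this way; the function names and the flattening of activation maps are assumptions, not the authors' code.

```python
import numpy as np

def separation_index(features, labels):
    """Fraction of samples whose nearest neighbor (Euclidean) has the
    same label. Higher values mean better class separation.
    Illustrative definition; the paper's formulation may differ."""
    d = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # ignore self-distance
    nn = d.argmin(axis=1)                      # each sample's nearest neighbor
    return float((labels[nn] == labels).mean())

def rank_filters(feature_maps, labels):
    """Score each filter by the SI of its flattened activation map,
    so the most class-discriminative filters can be kept."""
    scores = []
    for f in range(feature_maps.shape[1]):     # feature_maps: (N, F, H, W)
        flat = feature_maps[:, f].reshape(len(labels), -1)
        scores.append(separation_index(flat, labels))
    return np.argsort(scores)[::-1]            # most informative filters first
```

Under this reading, a layer whose output SI barely exceeds its input SI adds little separability and is a candidate for removal, while within the kept layers, low-scoring filters are pruned first.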
To design the fully connected layers (FCLs) that follow the compressed convolutional structure, the researchers introduced a method based on the Center-based Separation Index (CSI). This approach sizes the FCLs so that they preserve the generalization achieved by the convolutional (filtering) layers without over-parameterization.
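As with the SI, the exact CSI formula is not given here. One plausible center-based reading, the fraction of samples lying closest to their own class center, is sketched below purely for illustration; the definition and function name are assumptions.

```python
import numpy as np

def center_based_separation_index(features, labels):
    """Fraction of samples nearest (Euclidean) to their own class center.
    A hypothetical reading of a 'center-based' separation index."""
    classes = np.unique(labels)
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(features[:, None, :] - centers[None, :, :], axis=-1)
    assigned = classes[d.argmin(axis=1)]       # center each sample is closest to
    return float((assigned == labels).mean())
```

Intuitively, once such a score saturates as FCL width or depth grows, adding more fully connected capacity yields no further separation and only inflates the parameter count.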
Furthermore, several popular CNN architectures, including the visual geometry group’s 16-layer network (VGG-16), GoogLeNet, densely connected convolutional network 40 (DenseNet-40), and residual network 56 (ResNet-56), trained on the CIFAR-10 dataset, were employed to evaluate the presented compression technique.
Research Findings
The results showed that the proposed SI-based compression outperformed state-of-the-art methods in parameter reduction and computational efficiency while maintaining or even improving the accuracy of the original models. For example, when compressing VGG-16, the authors pruned 87.5% of the parameters and cut the number of floating-point operations (FLOPs) by 76% without a significant drop in accuracy. Similarly, for GoogLeNet, the method achieved a 74.4% reduction in FLOPs and a 77.6% reduction in parameters with essentially unchanged accuracy (95.00% vs. 95.05%).
The researchers also applied their compression technique to the DenseNet-40 and ResNet-56 architectures with similarly strong results. For DenseNet-40, the method reduced the number of parameters by 78.8% and the FLOPs by 76.1%, with only a 0.04% drop in accuracy. For ResNet-56, compression yielded a 57.8% reduction in parameters and a 58.3% improvement in speed while maintaining nearly identical accuracy (93.49% vs. 93.52%).
Additionally, the authors provided an illustrative transfer learning example: compressing the Inception V3 model and applying it to the "Cats-vs-Dogs" dataset. By leveraging the SI, they identified and removed redundant layers and filters, producing a more compact model with fewer parameters and faster inference without compromising classification performance. A sketch of how such layer-wise screening might look in practice follows.
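The sketch below is not the authors' code; it only shows how intermediate features of a pretrained torchvision Inception V3 could be extracted at a given depth so that a separation score can be computed per layer. The cut point "Mixed_6e" is an arbitrary illustrative choice, and the separation scoring itself is assumed to follow a function like the SI sketch above.

```python
import torch
from torchvision import models
from torchvision.models.feature_extraction import create_feature_extractor

# Load a pretrained Inception V3; "Mixed_6e" is an illustrative cut point,
# chosen here only to demonstrate per-depth feature extraction.
model = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
model.eval()
extractor = create_feature_extractor(model, return_nodes={"Mixed_6e": "feat"})

x = torch.randn(4, 3, 299, 299)                # dummy batch standing in for images
with torch.no_grad():
    feats = extractor(x)["feat"]               # shape (4, 768, 17, 17)
print(feats.shape)
# Computing a separation score on features extracted at each depth would
# indicate where additional layers stop improving class separability,
# marking everything beyond that depth as a candidate for removal.
```

In a transfer learning setting, the depth at which the score saturates on the target data (here, cats vs. dogs) marks a natural truncation point for the pretrained backbone.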
Applications
The presented technique has several practical implications, particularly in the deployment of CNNs on resource-constrained devices. By significantly reducing the size and computational requirements of these models, the technique enables their efficient implementation on mobile, wearable, and edge computing platforms.
This is crucial for real-time applications, such as smart wearables, virtual and mixed reality systems, and other Internet of Things (IoT) devices. Furthermore, the ability to maintain or even improve the generalization capabilities of compressed models is a valuable asset. This ensures that the optimized networks can still perform well on a wide range of data, making them suitable for diverse classification tasks.
Conclusion
In summary, the novel geometric approach proved effective for accelerating CNNs. It effectively identified and removed redundant elements within the CNN structure, leading to significant reductions in model size and computational requirements without compromising accuracy. As the demand for efficient AI-powered devices continues to grow, it could be a valuable contribution to the field of deep learning optimization, enabling the deployment of high-performance CNNs in resource-constrained environments.