Enhancing Smart Cities with Dual-Branch Residual Networks for Urban Sound Classification

By Ashutosh Roy. Reviewed by Susha Cheriyedath, M.Sc. Aug 2, 2023

As smart cities continue to embrace advanced technologies such as the Internet of Things (IoT), cloud computing, and artificial intelligence, the need for efficient urban environmental sound classification systems has become paramount. These systems are crucial in various applications, including city management, security, and environmental monitoring. However, accurate classification of environmental sounds in urban settings remains challenging due to the dynamic nature of sounds and the interference of urban noise.

*Study: Enhancing Smart Cities with Dual-Branch Residual Networks for Urban Sound Classification. Image credit: jamesteohart/Shutterstock*

To address this issue, researchers from China Jiliang University and Hangzhou Aihua Intelligent Technology Co., Ltd. proposed a dual-branch residual network for urban environmental sound classification. Their article, published in the journal Sensors, describes the system, its potential impact on smart cities, and its experimental results.

Understanding audio signal processing and feature extraction

In audio signal processing, extracting essential features from sound signals is crucial for effective classification. Frequency-domain analysis offers valuable insights into the characteristics of audio signals. Two common feature extraction methods used in sound classification are the log-spectrogram and the log-Mel spectrogram.

The log-spectrogram method involves obtaining the spectrum of each frame using a fast Fourier transform (FFT) and then combining these spectra to form the complete spectrogram. The log-spectrogram characteristics of the signal are extracted by taking the logarithm of the spectrogram. This method comprehensively represents the signal's frequency content and temporal dynamics, making it useful for many sound classification tasks.
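
The framing, FFT, and logarithm steps described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name, the Hann window, the frame length of 1024 samples, the hop of 512 samples, and the small `eps` offset are assumptions chosen for the example, not settings reported in the study.

```python
import numpy as np

def log_spectrogram(signal, frame_len=1024, hop=512, eps=1e-10):
    """Return a (num_frames, frame_len // 2 + 1) log-power spectrogram.

    frame_len, hop and the Hann window are illustrative assumptions.
    The signal must contain at least frame_len samples.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = np.hanning(frame_len)
    num_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(num_frames)
    ])
    spectrum = np.fft.rfft(frames, axis=1)   # FFT of each frame
    power = np.abs(spectrum) ** 2            # power spectrogram
    return np.log(power + eps)               # eps avoids log(0)

# Usage: one second of a 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = log_spectrogram(tone)
```

Each row of `spec` is one time frame and each column one frequency bin, so the array can be passed to a classifier as a two-dimensional, image-like input.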

The log-Mel spectrogram method, by contrast, maps the spectrum of each frame onto the Mel scale, which approximates the frequency resolution of the human ear. This transformation enhances noise suppression and captures more perceptually relevant information, making it particularly useful for tasks with low signal-to-noise ratios (SNRs).
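
One common way to perform the Mel conversion is to multiply the power spectrum of each frame by a bank of triangular filters whose centers are equally spaced on the Mel scale. The sketch below assumes the widely used formula mel = 2595 * log10(1 + f / 700) and a default of 64 filters; both are assumptions made for illustration, and the study may use different settings. All function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    # Common Mel formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=np.float64) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=np.float64) / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sample_rate):
    """Triangular filters with shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    # n_mels + 2 band edges, equally spaced on the Mel scale.
    top_mel = float(hz_to_mel(sample_rate / 2.0))
    edges = mel_to_hz(np.linspace(0.0, top_mel, n_mels + 2))
    fb = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        lo, center, hi = edges[m], edges[m + 1], edges[m + 2]
        rising = (fft_freqs - lo) / (center - lo)
        falling = (hi - fft_freqs) / (hi - center)
        fb[m] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(power_spec, sample_rate, n_mels=64, eps=1e-10):
    """power_spec: (num_frames, n_fft // 2 + 1) power spectrogram."""
    n_fft = 2 * (power_spec.shape[1] - 1)
    fb = mel_filterbank(n_mels, n_fft, sample_rate)
    mel_power = power_spec @ fb.T        # pool FFT bins into Mel bands
    return np.log(mel_power + eps)
```

Because the filters widen as frequency increases, high-frequency bins are averaged more heavily than low-frequency ones, which is one reason the representation tends to be less sensitive to broadband noise.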

The power of dual-branch residual networks

The proposed system introduces a dual-branch residual network to advance the state-of-the-art in urban environmental sound classification. The key idea behind this network is to extract both log-spectrogram and log-Mel spectrogram features separately and then integrate them effectively to prevent information loss.
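
The article does not spell out how the two feature streams are integrated, so the following is only a toy sketch of the general dual-branch idea. It assumes fusion by concatenation, replaces each residual branch with a simple pool-and-project stand-in, and uses random weights; none of these choices come from the study, and every name is hypothetical.

```python
import numpy as np

def branch(features, weights):
    """Stand-in for one branch: average over time, project, apply ReLU.

    A real branch would be a deep residual network; this placeholder
    only shows where each feature type enters the model.
    """
    pooled = features.mean(axis=0)              # (bins,)
    return np.maximum(0.0, pooled @ weights)    # (embed_dim,)

def fuse_and_classify(log_spec, log_mel, w_spec, w_mel, w_out):
    z_spec = branch(log_spec, w_spec)           # log-spectrogram branch
    z_mel = branch(log_mel, w_mel)              # log-Mel branch
    # Assumed fusion: concatenate so neither view is discarded.
    fused = np.concatenate([z_spec, z_mel])
    logits = fused @ w_out
    exp = np.exp(logits - logits.max())         # numerically stable softmax
    return exp / exp.sum()

# Usage with random stand-in inputs and weights (10 output classes).
rng = np.random.default_rng(0)
log_spec = rng.normal(size=(14, 513))   # frames x FFT bins
log_mel = rng.normal(size=(14, 64))     # frames x Mel bands
w_spec = rng.normal(size=(513, 32))
w_mel = rng.normal(size=(64, 32))
w_out = rng.normal(size=(64, 10))
probs = fuse_and_classify(log_spec, log_mel, w_spec, w_mel, w_out)
```

The point of the sketch is structural: each feature type gets its own parameters, and the classifier only sees the two embeddings after they have been combined.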

The dual-branch residual network incorporates Res2Net modules, a multi-scale variant of the residual block, to enlarge the network's receptive field. As a result, the network can capture long-range dependencies in the audio signals and avoid redundant information, leading to better feature representation.
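
To illustrate why a Res2Net-style module widens the receptive field, the sketch below splits the channels into groups and feeds each group's output into the next group before filtering, so later groups see progressively wider context. It is a simplified assumption-laden version: it uses depthwise 3-tap convolutions along time, omits the 1x1 convolutions that surround the split in the published Res2Net design, and is not the configuration used in the study. The function names are hypothetical.

```python
import numpy as np

def conv3(x, kernel):
    """Depthwise 3-tap convolution along time; x has shape (channels, time)."""
    padded = np.pad(x, ((0, 0), (1, 1)))
    return (kernel[0] * padded[:, :-2]
            + kernel[1] * padded[:, 1:-1]
            + kernel[2] * padded[:, 2:])

def res2net_block(x, kernels):
    """x: (channels, time); kernels: one 3-tap kernel per group after the first.

    The channel count must be divisible by len(kernels) + 1.
    """
    groups = np.split(x, len(kernels) + 1, axis=0)
    outputs = [groups[0]]            # first group passes through unchanged
    prev = None
    for x_i, k in zip(groups[1:], kernels):
        # Each later group also receives the previous group's output.
        inp = x_i if prev is None else x_i + prev
        prev = np.maximum(0.0, conv3(inp, k))
        outputs.append(prev)
    return x + np.concatenate(outputs, axis=0)   # residual shortcut

# Usage: a single impulse in the second channel spreads further in each group.
x = np.zeros((4, 11))
x[1, 5] = 1.0
out = res2net_block(x, [np.ones(3)] * 3)
widths = [int(np.count_nonzero(row)) for row in out]
```

In this toy run the impulse covers 3, 5, and 7 time steps in the second, third, and fourth groups, which shows the receptive field growing inside a single block.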

Furthermore, the network employs a self-attention layer, which enhances its ability to focus on important features and suppress irrelevant ones. The self-attention mechanism improves the discriminative power of the features and contributes to the overall accuracy of the classification system.
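
The article does not describe the exact form of the self-attention layer. As a point of reference, the sketch below implements standard scaled dot-product self-attention over a sequence of time frames; treating this as the layer used in the study is an assumption, and the projection matrices here are random placeholders.

```python
import numpy as np

def softmax(x, axis=-1):
    shifted = x - x.max(axis=axis, keepdims=True)   # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """x: (time, dim). Returns (output, attention_weights)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Similarity of every frame with every other frame, scaled by sqrt(d_k).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ v, weights

# Usage with random stand-in features: 6 frames, 8 features per frame.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 8))
w_q, w_k, w_v = (rng.normal(size=(8, 4)) for _ in range(3))
attended, weights = self_attention(x, w_q, w_k, w_v)
```

Each row of `weights` shows how strongly one frame draws on every other frame, which is the mechanism by which informative frames can be emphasized and uninformative ones down-weighted.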

Experimental results and comparison

To evaluate the effectiveness of the proposed system, the researchers conducted extensive experiments using the UrbanSound8K and ESC-50 datasets. These datasets contain various environmental sounds recorded in real urban environments, presenting diverse and complex challenges for sound classification algorithms.

The results demonstrated that the dual-branch residual network significantly outperformed methods that rely on a single feature type. The network achieved higher classification accuracy across multiple sound categories, including "Air conditioner," "Children playing," "Drilling," "Engine idling," and "Jackhammer."

The impact on smart cities

Accurate urban environmental sound classification holds immense potential for smart cities. By deploying such advanced systems, cities can benefit in several ways:

Environmental monitoring: The system can continuously monitor and analyze various sounds in the urban environment, enabling cities to assess noise pollution levels and identify areas where noise mitigation measures are required. This information can help city planners make informed decisions to create quieter, more livable spaces.

Public safety and security: The ability to classify sounds accurately can enhance public safety and security. The system can detect sounds related to emergencies, accidents, or criminal activities, allowing authorities to respond promptly and effectively.

Traffic management: The system's capability to identify specific sounds, such as those from engines or vehicles, can be leveraged for intelligent traffic management. It can detect traffic congestion, accidents, or anomalies, enabling real-time adjustments to traffic flow and signal timings.

Urban planning: The data collected from the sound classification system can provide valuable insights into the acoustic environment of different urban areas. Planners can use this information to design more sound-friendly neighborhoods and optimize urban layouts.

Health and well-being: Excessive noise pollution in urban environments can adversely affect residents' health and well-being. By accurately assessing noise levels and sources, cities can implement measures to reduce noise-related health risks and improve overall citizen satisfaction.

Conclusion

As smart cities continue to evolve and grow, integrating advanced technologies becomes critical for effective urban management. Urban environmental sound classification systems offer a promising solution to address the challenges posed by complex and variable environmental sounds in urban settings.

The proposed dual-branch residual network, with its feature fusion capabilities, shows significant potential in improving classification accuracy for environmental sounds. By accurately identifying and classifying various sounds, cities can take proactive measures to enhance citizen experience, safety, and overall quality of life.

As research in this domain progresses, we can expect even more sophisticated and efficient systems to support the evolution of smart cities worldwide. By harnessing the power of dual-branch residual networks and embracing advancements in artificial intelligence, smart cities can create sustainable, resilient, and citizen-centric urban environments.

Journal reference:

Zhang, Dongping, et al. (2023). An Automatic Classification System for Environmental Sound in Smart Cities. Sensors, 23(15), 6823. https://doi.org/10.3390/s23156823, https://www.mdpi.com/1424-8220/23/15/6823

Written by

Ashutosh Roy

Ashutosh Roy has an MTech in Control Systems from IIEST Shibpur. He holds a keen interest in the field of smart instrumentation and has actively participated in the International Conferences on Smart Instrumentation. During his academic journey, Ashutosh undertook a significant research project focused on smart nonlinear controller design. His work involved utilizing advanced techniques such as backstepping and adaptive neural networks. By combining these methods, he aimed to develop intelligent control systems capable of efficiently adapting to non-linear dynamics.
