Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems

In an article recently published in the journal Applied Sciences, researchers proposed Swin-APT for image semantic segmentation and object detection tasks in intelligent transportation systems (ITSs).

Study: Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems. Image credit: Generated using DALL·E 3

Background

ITSs increasingly incorporate technologies such as artificial intelligence (AI) and the Internet of Things (IoT) to provide traffic information services based on real-time traffic data. AI is used extensively in ITSs because it can reduce human involvement while maintaining high accuracy. Pedestrians and vehicles are crucial elements of the dynamic and complex road environment in urban traffic networks. The raw data for ITSs is obtained from object detection and semantic segmentation tailored specifically for smart transportation.

The trajectories of both pedestrians and vehicles can be derived from detection and segmentation methods, which enable the inference of possible safety hazards. Images contain a substantial amount of underlying semantic information, and computer vision technology, which plays a crucial role in ITSs, assists intelligent vehicles in understanding the scene semantics.

Although existing algorithms can analyze complex scenes through object detection and semantic segmentation, they handle each task independently and run them sequentially, which adds unnecessary processing time. In autonomous driving scenes, integrating the requirements of several tasks into a unified model enables effective information sharing among the tasks, improving the overall performance of the autonomous driving perception system. Moreover, models must be accurate while also meeting the real-time performance and computational efficiency requirements of practical applications such as traffic control and autonomous vehicles.

The proposed approach

In this study, researchers proposed Swin-APT, a deep learning (DL)-based approach for semantic segmentation and object detection in ITSs. The study's objective was to use DL-based algorithms for scene understanding and to realize segmentation predictions on traffic lane datasets to assist in road condition analysis.

Swin-APT combines a lightweight Swin-Transformer-based network with a multiscale adapter network designed for object detection and image semantic segmentation tasks. This combination improved the model's prediction accuracy while keeping the computational cost small.

Additionally, a module based on the inter-frame consistency of image frames was proposed to obtain more accurate road information from images. The module measures information consistency between adjacent image frames and applies contrastive learning to them. The adapter network operates in the multiscale feature space to identify scene objects of various scales effectively in downstream tasks.

In the Swin-APT architecture, the encoder is composed of four consecutive Swin-Transformer blocks. Together with the proposed adapter network, these blocks form a feature pyramid structure that encodes the images into high-level semantic features.
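To make the feature pyramid concrete, the sketch below computes the feature-map size produced by each of four encoder stages. It assumes the layout of the standard Swin Transformer (a 4x patch embedding, then 2x patch merging before each later stage, with 96 base channels as in Swin-T). The article does not give the exact configuration of the lightweight Swin-APT encoder, so the function name, defaults, and input size here are illustrative only.

```python
def swin_pyramid_shapes(height, width, base_channels=96, patch_size=4, num_stages=4):
    """Return (stage, height, width, channels) for each encoder stage.

    Assumption: standard Swin layout, not confirmed for Swin-APT. The patch
    embedding downsamples by `patch_size`; each later stage halves the
    resolution and doubles the channel count via patch merging.
    """
    shapes = []
    stride = patch_size
    channels = base_channels
    for stage in range(1, num_stages + 1):
        shapes.append((stage, height // stride, width // stride, channels))
        stride *= 2
        channels *= 2
    return shapes


# Hypothetical 512 x 1024 road image
for stage, h, w, c in swin_pyramid_shapes(512, 1024):
    print(f"stage {stage}: {h} x {w} x {c}")
```

Under these assumptions the four stages run at strides 4, 8, 16, and 32, which is what lets a pyramid-style adapter see both small road markings and large road regions.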

Subsequently, these high-level semantic features were fed into the inter-frame consistency module, which learned consistent information from two consecutive frames processed in parallel to encode the images’ semantic meaning. Finally, the image features were passed through task-specific heads for road marking detection and road segmentation.
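One common way to express contrastive learning between adjacent frames is an InfoNCE-style loss, sketched below in plain Python. This is an assumption for illustration: the article does not state which loss the inter-frame consistency module uses, and the function names, the temperature value, and the toy feature vectors are all hypothetical. Features at the same index in the two frames are treated as the positive pair, and all other features in the next frame as negatives.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(frame_t, frame_t1, temperature=0.1):
    """InfoNCE loss between two lists of feature vectors (illustrative only).

    frame_t[i] and frame_t1[i] are assumed to describe the same region in
    two consecutive frames; every other pairing counts as a negative.
    """
    a = [l2_normalize(v) for v in frame_t]
    b = [l2_normalize(v) for v in frame_t1]
    total = 0.0
    for i, anchor in enumerate(a):
        logits = [dot(anchor, other) / temperature for other in b]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(z - peak) for z in logits))
        total += log_denominator - logits[i]
    return total / len(a)


# Toy features: the next frame is a slightly perturbed copy of the current one.
current = [[1.0, 0.0], [0.0, 1.0]]
consistent_next = [[0.9, 0.1], [0.1, 0.9]]
inconsistent_next = [[0.1, 0.9], [0.9, 0.1]]

print(info_nce(current, consistent_next))
print(info_nce(current, inconsistent_next))
```

The loss is small when matching regions keep similar features across frames and large when they do not, which is the behavior a consistency objective of this kind is meant to reward.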

Experimental evaluation and findings

Researchers performed extensive experiments on the CeyMo road marking detection dataset and four public road semantic segmentation datasets (CeyMo, BDD100K, CamVid, and SYNTHIA) to validate the proposed approach and find a balance between computational cost and accuracy.

Mean intersection over union (mIoU) and accuracy were utilized as evaluation metrics for the road segmentation task, while mean average precision (mAP) was used as the evaluation metric for the road marking detection task. Experiments on road segmentation datasets demonstrated that the proposed Swin-APT was a feasible and effective approach compared to the existing models that were employed as baselines in this study.
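For readers unfamiliar with the segmentation metrics, the following minimal sketch computes mIoU and pixel accuracy from a confusion matrix. The function names and the six-pixel toy label maps are made up for illustration, and the benchmarks in the study may handle ignored labels or absent classes differently.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class) over flattened label maps."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(matrix):
    """Average per-class IoU = TP / (TP + FP + FN), skipping classes with no pixels."""
    ious = []
    for c in range(len(matrix)):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(row[c] for row in matrix) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class matches the true class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy example: 0 = background, 1 = road
y_true = [0, 0, 1, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=2)
print(mean_iou(cm), pixel_accuracy(cm))
```

In this toy case four of six pixels are correct, yet each class has an IoU of only 0.5, which shows why mIoU is usually the stricter of the two metrics.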

Swin-APT outperformed all other methods, including A-YOLOM, HybridNets, PSPNet, YOLOv8n(seg), DLT-Net, and MultiNet, on the BDD100K dataset by achieving the highest mIoU of 91.2%. This was higher even than the mIoU attained by the recent state-of-the-art model A-YOLOM, making Swin-APT the best-performing model among those compared for road segmentation on BDD100K.

Swin-APT was the second-best model on the CamVid benchmark dataset, attaining 81.3% mIoU. It outperformed DFANet A, DenseDecoder, VideoGCRF, and ETC-Mobile and fell slightly short of the best model, DeepLabV3Plus + SDCNetAug, which achieved the highest mIoU of 81.7%. These results indicated the versatility of Swin-APT and its effectiveness for different real-world applications. Similarly, on the synthetic SYNTHIA dataset, Swin-APT consistently outperformed its variants in all individual classes, including “Cyclist”, “Pedestrian”, “Vegetation”, “Car”, “Sky”, “Road”, and “Building”.

Overall, the proposed model Swin-APT achieved an improvement of up to 13.1% mIoU compared to the baseline models. Additionally, experiments on road marking detection on the CeyMo dataset showed that the proposed model led to an improvement of 1.85% mAP compared to the baseline model.
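The mAP figure for road marking detection is the mean of per-class average precision (AP). The sketch below computes AP for a single class on a single image, using a box IoU threshold of 0.5 and all-point interpolation of the precision-recall curve. These choices, the function names, and the toy boxes are assumptions for illustration; the article does not specify the exact evaluation protocol used on CeyMo.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """AP for one class; detections is a list of (score, box) pairs."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            iou = box_iou(box, gt)
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        # True positive only if it overlaps enough with an unmatched box.
        if best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))
    # Make precision non-increasing as recall grows (all-point interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)
        prev_recall = r
    return ap


# Toy example: two road markings, three detections (the second is a false positive)
gt_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
print(average_precision(dets, gt_boxes))
```

Averaging this value over all road marking classes (and, in practice, accumulating detections over the whole test set) gives the mAP.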


Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2023, December 18). Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems. AZoAi. Retrieved on July 04, 2024 from https://www.azoai.com/news/20231218/Swin-APT-Enhancing-Semantic-Segmentation-in-Intelligent-Transportation-Systems.aspx.

  • MLA

    Dam, Samudrapom. "Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems". AZoAi. 04 July 2024. <https://www.azoai.com/news/20231218/Swin-APT-Enhancing-Semantic-Segmentation-in-Intelligent-Transportation-Systems.aspx>.

  • Chicago

    Dam, Samudrapom. "Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems". AZoAi. https://www.azoai.com/news/20231218/Swin-APT-Enhancing-Semantic-Segmentation-in-Intelligent-Transportation-Systems.aspx. (accessed July 04, 2024).

  • Harvard

    Dam, Samudrapom. 2023. Swin-APT: Enhancing Semantic Segmentation in Intelligent Transportation Systems. AZoAi, viewed 04 July 2024, https://www.azoai.com/news/20231218/Swin-APT-Enhancing-Semantic-Segmentation-in-Intelligent-Transportation-Systems.aspx.
