Deep Learning Enhances Urban Building Mapping

In a paper published in the journal Scientific Reports, researchers tackled the challenges of extracting building footprints from high-resolution aerial and satellite images in urban areas. They proposed automating this process by integrating red, green, and blue (RGB) orthophotos with digital surface models (DSM) to create a consistent four-band dataset, enhancing pixel-to-pixel data fusion.

Architecture of U-Net. Image Credit: https://www.nature.com/articles/s41598-024-64231-0

Using DeepLabv3, a deep convolutional network for pixel-based semantic segmentation, they achieved superior accuracy and detailed building boundary delineation over a 21 km² area in Turin, Italy. The method also significantly reduced training time compared with conventional approaches like U-shaped networks (U-Net). The study demonstrated the potential of this integrated approach for applications in 3D modeling, change detection, and urban planning, supporting urban management tasks.

Background

Past work in building footprint segmentation includes rule-based methods, machine learning, and deep learning, often enhanced by data fusion techniques. Rule-based methods faced adaptability challenges, while machine learning improved detection but struggled with data alignment and computational demands. Deep learning with convolutional neural networks (CNNs) such as DeepLabv3 showed high accuracy, especially when incorporating multi-source data such as RGB orthophotos and DSM.

Data fusion significantly improved segmentation accuracy by enhancing contrast and boundary delineation. Despite advancements, challenges like misalignment and the complexity of multi-source data integration persist, leading to innovative solutions like generative adversarial networks (GANs).

Building Footprint Segmentation

The study used two primary raster layers obtained through aerial photogrammetry campaigns: an RGB orthomosaic with a 25 cm/pixel resolution providing spectral information and a DSM raster layer with a 50 cm/pixel resolution offering elevation data.

Pixel-level data fusion was employed to create a four-band integrated dataset, enhancing spectral and elevation information crucial for accurate building footprint segmentation. This process involved resampling the DSM to match the RGB orthomosaic's resolution, cropping both datasets to the same extent, stacking them along the band dimension, and normalizing pixel values.
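The four fusion steps described above can be sketched with NumPy. This is only an illustrative sketch, not the study's pipeline: the function name `fuse_rgb_dsm`, the nearest-neighbour resampling, the top-left cropping, and the min-max normalization are all assumptions, since the article does not say which resampling or normalization method was used.

```python
import numpy as np

def fuse_rgb_dsm(rgb, dsm, scale=2):
    """Fuse an RGB raster and a coarser DSM into one four-band array.

    rgb:   (H, W, 3) uint8 array, e.g. 25 cm/pixel.
    dsm:   (h, w) float array, e.g. 50 cm/pixel.
    scale: integer ratio between the two resolutions (assumed 50 / 25 = 2).
    """
    # 1. Resample the DSM onto the RGB grid (nearest neighbour, an assumption).
    dsm_up = np.repeat(np.repeat(dsm, scale, axis=0), scale, axis=1)

    # 2. Crop both layers to their common extent (top-left aligned here).
    rows = min(rgb.shape[0], dsm_up.shape[0])
    cols = min(rgb.shape[1], dsm_up.shape[1])
    rgb_c = rgb[:rows, :cols].astype(np.float32)
    dsm_c = dsm_up[:rows, :cols].astype(np.float32)

    # 3. Normalize: RGB to [0, 1] by bit depth, DSM to [0, 1] by min-max.
    rgb_n = rgb_c / 255.0
    lo, hi = dsm_c.min(), dsm_c.max()
    dsm_n = (dsm_c - lo) / (hi - lo) if hi > lo else np.zeros_like(dsm_c)

    # 4. Stack along the band dimension: (rows, cols, 4).
    return np.concatenate([rgb_n, dsm_n[..., np.newaxis]], axis=-1)

if __name__ == "__main__":
    rgb = np.full((4, 6, 3), 255, dtype=np.uint8)
    dsm = np.array([[0.0, 10.0, 20.0], [5.0, 15.0, 20.0]])
    print(fuse_rgb_dsm(rgb, dsm).shape)
```

In a real workflow the resampling and cropping would normally be driven by the rasters' georeferencing rather than by array indices, so that the two layers stay aligned pixel to pixel.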

The research focused on two leading deep learning algorithms for pixel-based semantic segmentation: U-Net and DeepLabv3. U-Net, known for its encoder-decoder architecture with skip connections, excels in capturing local and global features and recovering fine-grained details. However, its fixed kernel size may limit contextual information capture.

DeepLabv3, on the other hand, uses atrous convolution and atrous spatial pyramid pooling (ASPP) to handle large receptive fields and multi-scale contextual information efficiently. However, it may produce lower-resolution output maps. Both algorithms were evaluated for accuracy and boundary delineation on standalone and integrated datasets.
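To make the idea of atrous (dilated) convolution concrete, here is a minimal one-dimensional sketch in plain Python. It is not taken from the study: the function names are hypothetical, and real implementations operate on 2D feature maps inside a deep learning framework. The point is only that spacing the kernel taps `rate` samples apart enlarges the receptive field without adding weights.

```python
def effective_kernel_size(kernel_size, rate):
    """Span covered by a kernel whose taps are spaced `rate` samples apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)

def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1D cross-correlation with dilation `rate` (no padding)."""
    span = effective_kernel_size(len(kernel), rate)
    out = []
    for start in range(len(signal) - span + 1):
        # Each tap reads the input `rate` positions after the previous one.
        out.append(sum(w * signal[start + i * rate] for i, w in enumerate(kernel)))
    return out

if __name__ == "__main__":
    x = [1, 2, 3, 4, 5, 6, 7]
    print(atrous_conv1d(x, [1, 1, 1], rate=1))
    print(atrous_conv1d(x, [1, 1, 1], rate=2))
```

ASPP applies several such convolutions with different rates in parallel and combines their outputs, which is how it gathers context at multiple scales. The article does not state which rates the study used.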

The team manually digitized 450 buildings for training and validation and converted them into binary masks with a 256×256-pixel size. The dataset was split into 80% training and 20% validation sets. The work relied on the TensorFlow framework and ArcGIS for data preparation. Key training parameters included the softmax activation function, cross-entropy loss function, Adam optimizer, a batch size of 8, and 20 epochs. Both U-Net and DeepLabv3 used ResNet-50 as their backbone to enhance feature extraction and segmentation accuracy.
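The 80/20 split can be illustrated with a few lines of plain Python. This is a sketch under assumptions: the helper `split_samples`, the random shuffle, and the fixed seed are illustrative, and the article does not say whether the split was made per building or per 256×256 tile, or whether it was random.

```python
import random

def split_samples(sample_ids, train_fraction=0.8, seed=42):
    """Shuffle sample identifiers reproducibly and split them in two."""
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # local RNG, does not touch global state
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]

if __name__ == "__main__":
    # 450 hypothetical sample IDs, one per digitized building.
    train_ids, val_ids = split_samples(range(450))
    print(len(train_ids), len(val_ids))
```

With 450 samples, this yields 360 for training and 90 for validation.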

Enhancing Urban Segmentation

The results and analysis section investigates the impact of data fusion and elevation information on building footprint segmentation through various evaluation metrics. These metrics, derived from the confusion matrix, encompass precision, accuracy, recall, F1 score, and intersection over union (IoU), and together they evaluate how well the models distinguish building pixels from non-building pixels. Notably, DeepLabv3 trained on the integrated dataset emerges as the top performer, showing significant improvements in recall, F1 score, and IoU compared to the other configurations.
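For reference, the metrics named above follow directly from the four confusion-matrix counts for the building class. The sketch below uses the standard definitions; the function names and the example counts are made up for illustration and are not results from the study.

```python
def confusion_counts(predicted, truth):
    """Count TP, FP, FN, TN over two equal-length sequences of 0/1 labels."""
    tp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(predicted, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(predicted, truth) if p == 0 and t == 0)
    return tp, fp, fn, tn

def segmentation_metrics(tp, fp, fn, tn):
    """Standard binary segmentation metrics; 0.0 when a denominator is zero."""
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "iou": ratio(tp, tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }

if __name__ == "__main__":
    # Hypothetical pixel counts, not values reported in the paper.
    print(segmentation_metrics(tp=80, fp=20, fn=20, tn=880))
```

Note that accuracy can look high when non-building pixels dominate the scene, which is one reason IoU and F1 are usually more informative for footprint segmentation.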

This result highlights the effectiveness of integrating RGB and DSM data to enhance segmentation accuracy, particularly in complex urban environments. Computational efficiency is also emphasized, with DeepLabv3 demonstrating faster training times due to its efficient use of atrous convolutions, underscoring its suitability for practical deployment. Furthermore, the performance evaluation examines the interactions between model architecture and dataset complexity.

DeepLabv3's advanced features, such as atrous spatial pyramid pooling, prove crucial in leveraging the richer feature set provided by the integrated dataset. This capability allows DeepLabv3 to excel in capturing multi-scale contextual information essential for precise segmentation, as evidenced by its superior results across all metrics evaluated. Despite the computational overhead associated with the integrated dataset, the substantial gains in segmentation quality justify its use, emphasizing the pivotal role of data fusion and elevation information in enhancing urban mapping applications.

Alongside the quantitative metrics, qualitative visualizations demonstrate the impact of data fusion and model architecture on building footprint segmentation accuracy. Visual comparisons across varied urban scenarios show that U-Net trained on the integrated dataset performs better in dense areas thanks to the elevation data, while DeepLabv3 excels in handling complex geometries and terrain variations.

This comprehensive approach validates the effectiveness of integrating RGB and DSM data, providing practical insights for optimizing segmentation workflows in urban environments. Ultimately, the study underscores how these advancements enhance building footprint delineation, offering significant benefits for urban mapping and planning strategies.

Conclusion

To sum up, this study used integrated high-resolution datasets that combined RGB orthophotos with DSMs to automate the extraction of building footprints in urban areas. By employing DeepLabv3, the segmentation process made effective use of height information derived from DSMs, resulting in precise delineation of building boundaries.

Evaluation conducted in Turin, Italy, underscored the approach's advantages: superior accuracy and reduced training time compared to traditional methods like U-Net. These outcomes highlight the potential of this approach for enhancing applications such as 3D modeling, change detection, and urban planning.

Journal reference:
https://www.nature.com/articles/s41598-024-64231-0

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2024, June 19). Deep Learning Enhances Urban Building Mapping. AZoAi. Retrieved on December 23, 2024 from https://www.azoai.com/news/20240619/Deep-Learning-Enhances-Urban-Building-Mapping.aspx.

  • MLA

    Chandrasekar, Silpaja. "Deep Learning Enhances Urban Building Mapping". AZoAi. 23 December 2024. <https://www.azoai.com/news/20240619/Deep-Learning-Enhances-Urban-Building-Mapping.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "Deep Learning Enhances Urban Building Mapping". AZoAi. https://www.azoai.com/news/20240619/Deep-Learning-Enhances-Urban-Building-Mapping.aspx. (accessed December 23, 2024).

  • Harvard

    Chandrasekar, Silpaja. 2024. Deep Learning Enhances Urban Building Mapping. AZoAi, viewed 23 December 2024, https://www.azoai.com/news/20240619/Deep-Learning-Enhances-Urban-Building-Mapping.aspx.
