GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning

In a paper published in the journal Scientific Data, researchers introduced a methodology built on cloud computing platforms such as Google Earth Engine. They assembled a global database of nearly two million training units, labeled with seven primary and nine secondary land cover classes and covering 1984 to 2020.

Study: GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning. Image credit: Generated using DALL.E.3

Using machine learning (ML) algorithms and Landsat imagery, they ensured data quality and representation across diverse biogeographic regions. The resulting resource is valuable for studies ranging from land cover change to urban development.

Background

The accuracy of land cover maps derived from remote sensing hinges on robust training data: whichever classification algorithm is used, the training set must balance size against quality. Existing global datasets are limited in coverage, resolution, or temporal scope, which prompted the Global Land Cover Estimation (GLanCE) project. GLanCE aims to create a high-resolution, comprehensive training database spanning four decades, using cloud computing and ML to improve accuracy and ecological representation.

GLanCE Training Data Collection Overview

The GLanCE project collected its training data with trained image analysts from Boston University. Analysts used several tools, including the Google Earth Engine API and high-resolution imagery, to interpret land cover characteristics through a detailed process. Each entry in the database represents an individual Landsat pixel, termed a training unit.

A stringent quality assessment combined expert reviews with an ML-based cross-validation process to catch and remove inconsistently labeled training units. The team comprised 6 to 12 trained analysts who received consistent training, periodic refresher courses, and weekly meetings aimed at improving quality. This approach resulted in continent-specific databases with up to 23 land cover attributes per unit.

GLanCE collected training data from multiple sources: the System for Terrestrial Ecosystem Parameterization (STEP) database, sample units selected by unsupervised clustering of Landsat spectral-temporal features, and feedback training units added to rectify classification errors. Together, these sources represent both homogeneous and heterogeneous land cover, with an emphasis on thematic detail and completeness.

Additionally, GLanCE supplemented its dataset by harmonizing and standardizing publicly available, collaborator-contributed, and team-collected datasets so that they align with the GLanCE land cover classification key. Even so, some land cover classes remained underrepresented, prompting the team to augment the dataset with data from sources such as the World Settlement Footprint and Global Surface Water products. Candidate training units for this augmentation were selected algorithmically.

Pre-processing and harmonizing the supplementary datasets involved several steps to ensure quality: interpreter confidence filtering, legend harmonization, comparison with existing land cover products, and visual inspection. Although the team worked to align these datasets with GLanCE standards, it had little control over the accuracy and consistency of external data, so some limitations remained. Within these constraints, the iterative approach allowed the training dataset's quality and relevance for global land cover mapping to be refined continually.
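Two of the pre-processing steps named above, interpreter confidence filtering and legend harmonization, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the crosswalk entries, the record fields (source_class, confidence), the 1-3 confidence scale, and the threshold are all hypothetical and are not taken from the GLanCE paper.

```python
# Hypothetical crosswalk from an external dataset's legend to Level One labels.
# Both the external class names and the target names are illustrative only.
LEGEND_CROSSWALK = {
    "open water": "Water",
    "evergreen forest": "Trees",
    "deciduous forest": "Trees",
    "grassland": "Herbaceous",
    "shrubland": "Shrub",
    "urban": "Developed",
}

MIN_CONFIDENCE = 2  # assumed threshold on a hypothetical 1-3 interpreter scale


def harmonize(records, crosswalk=LEGEND_CROSSWALK, min_confidence=MIN_CONFIDENCE):
    """Return (kept, dropped) after confidence filtering and legend mapping."""
    kept, dropped = [], []
    for rec in records:
        target = crosswalk.get(rec["source_class"].strip().lower())
        # Drop records that are low confidence or have no crosswalk entry.
        if rec["confidence"] < min_confidence or target is None:
            dropped.append(rec)
            continue
        kept.append({**rec, "level1": target})
    return kept, dropped


external = [
    {"id": 1, "source_class": "Evergreen Forest", "confidence": 3},
    {"id": 2, "source_class": "Grassland", "confidence": 1},     # low confidence
    {"id": 3, "source_class": "Mixed mosaic", "confidence": 3},  # no crosswalk entry
    {"id": 4, "source_class": "urban", "confidence": 2},
]

kept, dropped = harmonize(external)
print([r["id"] for r in kept], [r["id"] for r in dropped])
```

Records that cannot be mapped are set aside rather than guessed at, which mirrors the article's point that externally sourced labels can only be aligned with the GLanCE key to a limited degree.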

Overview of GLanCE Training Dataset

The GLanCE training dataset, available under the Creative Commons license CC-BY-4.0 from Source Cooperative, comprises two hierarchical sets of land cover classes: seven broad (Level One) and nine secondary (Level Two) classes. This classification scheme is designed primarily for land cover and aligns with standard systems like the IPCC and FAO Land Cover Classification.

Each training unit is based on a Landsat pixel from the 1984 to 2020 record and carries Level One and Level Two labels plus additional attributes such as LC_Confidence and Segment_Type, the latter recording whether the unit is stable or transitional. Around 79% of the dataset represents stable land cover, while 21% represents change, including abrupt and gradual transitions such as forest regrowth or coastal water dynamics. The dataset covers a range of land cover classes worldwide, yet only 50% of the units contain Level Two legend information because the supplementary datasets lack land use details.
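To make the record structure concrete, the sketch below models a training unit and computes the kind of shares quoted above (stable versus change, with or without a Level Two label). The attributes LC_Confidence and Segment_Type come from the article; the Python field names, the value encodings, the class names, and the four toy records are assumptions for illustration and do not reproduce the published schema or statistics.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrainingUnit:
    level1: str            # Level One label (class names here are illustrative)
    level2: Optional[str]  # Level Two label; None when the source lacked it
    lc_confidence: int     # LC_Confidence (value scale assumed)
    segment_type: str      # Segment_Type: "stable" or "change" (encoding assumed)
    start_year: int
    end_year: int


def share_by(units, key):
    """Fraction of units falling in each group defined by key(unit)."""
    counts = Counter(key(u) for u in units)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


# Toy records, not real GLanCE data.
units = [
    TrainingUnit("Trees", "Deciduous", 3, "stable", 1999, 2019),
    TrainingUnit("Herbaceous", None, 2, "stable", 2001, 2015),
    TrainingUnit("Water", None, 3, "change", 1990, 1996),
    TrainingUnit("Shrub", "Shrub", 2, "stable", 1984, 2019),
]

segment_shares = share_by(units, lambda u: u.segment_type)
level2_shares = share_by(units, lambda u: u.level2 is not None)
print(segment_shares)
print(level2_shares)
```

Applied to the real database, the same grouping would be the way to check the stable/change split or the Level Two coverage for a single continent or class.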

This extensive dataset merges disparate sources into a single global training database. It includes publicly available, collaborator-provided, GLanCE-collected, and Boston University team-collected data. Notably, the GLanCE data offers detailed ancillary information and change records over extended periods, often spanning the 20 years between 1999 and 2019. The distribution among continents varies, however: Europe and South America have higher data availability, and in South America some time segments are up to 35 years long. Despite these variations, the dataset is the most extensive and detailed publicly accessible global land cover and land use training resource to date and supports a wide range of applications.

GLanCE Dataset: Rigorous Validation Analysis

The GLanCE dataset underwent a two-step ML-based cross-validation process to ensure data quality. Using climate variables and spatial clustering, the researchers removed approximately 15% of the training data, spread uniformly across continents, targeting misclassified units and confusion between classes. Classification results were also compared against reference data, following approaches from previous studies. Accuracy was high across most continents, but some land cover types remained hard to separate, notably herbaceous vegetation and shrubs. This confusion reflects mixed representations within training units and shows how difficult the distinction is even at a 30 m spatial resolution.
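The general idea of screening labels with cross-validation can be sketched as follows: each unit is classified by a model trained without it, and units whose label disagrees with that out-of-fold prediction are flagged for review or removal. The article does not name the classifier or the features, so this sketch swaps in a simple nearest-centroid classifier and two made-up feature values per unit; it does not reproduce the climate variables, spatial clustering, or two-step design of the actual procedure.

```python
import random
from collections import defaultdict


def centroids(samples):
    """Mean feature vector per class from (features, label) pairs."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(cents, x):
    """Label of the closest centroid by squared Euclidean distance."""
    return min(cents, key=lambda y: sum((a - b) ** 2 for a, b in zip(cents[y], x)))


def flag_suspect_units(samples, k=5, seed=0):
    """Indices whose label disagrees with the out-of-fold prediction."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    suspects = []
    for held_out in folds:
        held = set(held_out)
        # Train only on units outside the current fold.
        cents = centroids([samples[i] for i in order if i not in held])
        suspects += [i for i in held_out
                     if predict(cents, samples[i][0]) != samples[i][1]]
    return sorted(suspects)


# Toy data: two separable classes plus one deliberately mislabeled unit.
herb = [([0.20 + 0.01 * i, 0.30 - 0.01 * i], "Herbaceous") for i in range(10)]
shrub = [([0.80 - 0.01 * i, 0.70 + 0.01 * i], "Shrub") for i in range(10)]
mislabeled = [([0.21, 0.31], "Shrub")]  # looks herbaceous, labeled shrub
samples = herb + shrub + mislabeled

suspects = flag_suspect_units(samples)
print(suspects)  # [20]
```

In the toy data only the deliberately mislabeled unit is flagged. With real herbaceous and shrub units, whose spectral signatures overlap, many more disagreements would be expected, which is consistent with the confusion the article reports between those classes.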

Conclusion

To sum up, the GLanCE dataset was validated with ML techniques and spatial analysis to ensure data accuracy. Across continents, about 15% of the training data was removed to address misclassification and confusion among land cover classes.

Some classification challenges remained, particularly in differentiating herbaceous vegetation from shrubs, because these classes are often mixed within training units. These findings underscore how difficult it is to delineate specific land cover categories, even at a spatial resolution of 30 meters. The dataset is a comprehensive resource for global land cover analysis, and it also highlights the need for continued research on these nuanced classification problems.

Article Revisions

  • Dec 18 2023 - Fixed error in journal paper hyperlink

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Chandrasekar, Silpaja. (2023, December 17). GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning. AZoAi. Retrieved on December 22, 2024 from https://www.azoai.com/news/20231213/GLanCE-Revolutionizing-Global-Land-Cover-Estimation-with-Cloud-Computing-and-Machine-Learning.aspx.

  • MLA

    Chandrasekar, Silpaja. "GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning". AZoAi. 22 December 2024. <https://www.azoai.com/news/20231213/GLanCE-Revolutionizing-Global-Land-Cover-Estimation-with-Cloud-Computing-and-Machine-Learning.aspx>.

  • Chicago

    Chandrasekar, Silpaja. "GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning". AZoAi. https://www.azoai.com/news/20231213/GLanCE-Revolutionizing-Global-Land-Cover-Estimation-with-Cloud-Computing-and-Machine-Learning.aspx. (accessed December 22, 2024).

  • Harvard

    Chandrasekar, Silpaja. 2023. GLanCE: Revolutionizing Global Land Cover Estimation with Cloud Computing and Machine Learning. AZoAi, viewed 22 December 2024, https://www.azoai.com/news/20231213/GLanCE-Revolutionizing-Global-Land-Cover-Estimation-with-Cloud-Computing-and-Machine-Learning.aspx.
