Explore how cutting-edge technology uncovers ancient secrets beneath dense jungle canopies, revolutionizing archaeology and preserving heritage.
Archaeoscape. Our proposed dataset contains 888 km2 of aerial laser scans taken in Cambodia. The 3D point cloud LiDAR data (left) was processed to obtain a digital terrain model (middle). Archaeologists have drawn and field-verified 31,411 individual polygons by delineating anthropogenic features (right).
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
An article recently posted on the arXiv preprint* server explored the transformative role of airborne laser scanning (ALS) technology in archaeology. The researchers introduced a new dataset called 'Archaeoscape,' designed to bridge the gap between traditional archaeological methods and modern deep learning techniques. This large-scale, open-access dataset aims to address the challenges of analyzing ALS data, particularly in densely vegetated areas. It is over four times larger than comparable datasets, marking a significant leap in scale for ALS archaeology.
Advancement of Airborne Laser Scanning Technology
ALS represents a significant advancement in geospatial technology. This technology allows scientists to penetrate dense vegetation areas and reveal archaeological features that would otherwise remain hidden. This technology has transformed archaeology by producing high-resolution elevation models and detailed imagery, making it easier to identify subtle human-made structures. Previous studies have demonstrated ALS's effectiveness in uncovering hidden landscapes and enhancing the understanding of past human activities.
However, the application of deep learning to ALS data has been limited, partly due to the reliance of models on RGB-based pretraining and the lack of expert-annotated datasets. This has hindered the development of specialized models for analyzing the large volumes of data ALS surveys generate. To address this gap, the researchers developed a robust dataset to advance archaeological research and promote the integration of modern computational techniques in the field.
Archaeoscape: A Novel and Comprehensive Dataset
In this paper, the authors introduced Archaeoscape, a dataset covering 888 square kilometers and containing annotations of 31,141 archaeological features from the Angkorian era in Cambodia. It is the first open-access dataset mainly designed for ALS archaeology, providing data, annotations, and models tailored to this field. Archaeoscape includes annotations streamlined into five main categories—Temple, Mound, Hydrology, Void, and Background—making it user-friendly for analysis.
The study evaluated several modern segmentation models to showcase the effectiveness of computer vision techniques in detecting subtle human-made structures beneath dense jungle canopies. The methodology included extensive data acquisition campaigns using advanced light detection and ranging (LiDAR) systems from the Khmer Archaeology Lidar Consortium (KALC, 2012) and the Cambodian Archaeological Lidar Initiative (CALI, 2015). Helicopter-mounted Leica LiDAR systems captured high-resolution aerial imagery and three-dimensional (3D) point clouds, which were processed into normalized Digital Terrain Models (nDTMs) and orthophotos. Then, expert teams annotated the data, employing a classification system refined into five main categories: Temple, Mound, Hydrology, Void, and Background. This streamlined system improves usability and addresses challenges related to class imbalance.
Archaeoscape overview. We show the vectorial annotations overlaid onto the relative elevation maps for each parcel, and their assignment to the training, validation or test splits. The position and orientation of the parcels is arbitrary. The geometry of the annotations has been simplified to reduce the file size of the paper. Best viewed on a computer screen.
Findings of Deep Learning Techniques and Novel Dataset
The study highlighted the unique challenges of detecting ancient features using ALS technology. Traditional convolutional neural network (CNN)-based models like U-Net, previously dominant in archaeological studies, were found to have surprising competitiveness but also significant limitations when applied to the intricate patterns in ALS data. The authors demonstrated that modern segmentation models, such as Vision Transformers (ViTs), showed greater effectiveness in identifying archaeological features. However, challenges remain due to the subtlety of objects and the complexity of their surrounding environments.
The outcomes showed that the Archaeoscape dataset outperformed existing datasets. Its extensive spatial coverage and detailed feature annotations provide a valuable resource for advancing archaeological research and integrating modern computational methods. Additionally, the study underscored the importance of open-access policies in fostering collaboration and innovation within the archaeological community. The authors also highlighted ethical measures to prevent misuse of the dataset, such as partitioning the data, stripping georeferencing, and requiring user agreements to ensure responsible usage.
Practical Applications
Beyond academics, this research has significant implications for heritage management and conservation. The Archaeoscape dataset provides a robust framework for applying deep learning techniques, enabling archaeologists to automate the detection and analysis of archaeological features with greater efficiency and accuracy.
Integrating ALS technology with computational techniques opens new ways to explore previously inaccessible sites, uncover hidden structures, and enhance understanding of ancient civilizations, their landscapes, and environmental interactions. The dataset's open-access nature promotes collaboration, engaging experts from fields such as computer vision, remote sensing, and archaeology to advance knowledge in ALS-based research.
Conclusion and Future Directions
In summary, the introduction of Archaeoscape represents a significant advancement in the intersection of archaeology and technology. The dataset not only addresses the critical gap in expert-annotated resources but also sets a benchmark for future research in ALS archaeology. The study highlighted the importance of open-access policies and modern deep-learning approaches to advancing the analysis of archaeological data.
As the field evolves, the authors suggest further exploration of innovative solutions to the unique challenges presented by ALS data. Future research will focus on improving model performance by addressing domain shifts and enhancing the integration of elevation data with other channels. Integrating advanced machine learning techniques with traditional archaeological methods can uncover new insights into the complexities of human history and inhabited landscapes. Overall, the Archaeoscape dataset is an important resource for scientists seeking to harness technology to uncover the rich details of archaeological heritage.
*Important notice: arXiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as definitive, used to guide development decisions, or treated as established information in the field of artificial intelligence research.
Journal reference:
- Preliminary scientific report.
Perron, Y., & et, al. Archaeoscape: Bringing Aerial Laser Scanning Archaeology to the Deep Learning Era. arXiv, 2024, 2412, 05203. DOI: 10.48550/arXiv.2412.05203, https://arxiv.org/abs/2412.05203