High-Resolution Aerial Dataset of Above-Ground Storage Tanks

In an article published in the journal Scientific Data, researchers from Duke University developed an original large-scale, multi-class dataset of above-ground storage tanks (ASTs) from high-resolution aerial imagery across the contiguous United States. They presented a validation procedure to ensure the quality and reliability of the annotations.

Study: High-Resolution Aerial Dataset of Above-Ground Storage Tanks. Image credit: AUUSanAKUL/Shutterstock

Background

ASTs are large containers or vessels used to store liquids, typically industrial chemicals, petroleum products, or water, in various industries, including chemical and petroleum production, processing, refining, and transport. These tanks are situated above the ground surface, in contrast to underground storage tanks, and are vulnerable to natural and anthropogenic hazards such as hurricanes, floods, fires, explosions, and sabotage.

However, the data needed to assess system vulnerabilities and failures, calculate current production and capacity, and evaluate the state of energy and other infrastructure are not readily available to regulators, researchers, and other decision-makers. In some cases, remotely sensed imagery has been used to develop AST datasets for specialized purposes. Yet these datasets are limited by sparse annotations, narrow geographic coverage, missing location data, restricted availability, and simplified classifications. Therefore, there is a need for a publicly available AST dataset built on high-resolution aerial imagery.

About the Research

In the present paper, the dataset of ASTs was built from high-resolution aerial imagery collected by the National Agriculture Imagery Program (NAIP) of the United States Department of Agriculture (USDA). The NAIP imagery covered the contiguous US during the agricultural growing season and had a minimum ground sampling distance of 60 cm, allowing the identification of objects ranging from 3 to 69 m in diameter. The study selected NAIP tiles based on the presence of relevant infrastructure and identifiable objects. The selected tiles were then split into 512-by-512-pixel images for annotation.
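The tiling step and the link between pixels and ground distance can be illustrated with a minimal sketch. It assumes a 60 cm ground sampling distance and simply truncates windows at the tile edge; the article does not say how the authors handled partial windows, and the function names here are hypothetical.

```python
from typing import Iterator, Tuple

TILE_SIZE = 512   # image side length in pixels, as stated in the article
GSD_M = 0.6       # assumed ground sampling distance, in metres per pixel


def chip_windows(width: int, height: int,
                 size: int = TILE_SIZE) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (x_min, y_min, x_max, y_max) pixel windows that cover a tile.

    Windows at the right and bottom edges are truncated to the tile bounds
    (an assumption; padding or overlap would be equally plausible).
    """
    for y in range(0, height, size):
        for x in range(0, width, size):
            yield x, y, min(x + size, width), min(y + size, height)


def pixels_to_metres(n_pixels: float, gsd_m: float = GSD_M) -> float:
    """Convert a length in pixels to metres on the ground."""
    return n_pixels * gsd_m


if __name__ == "__main__":
    windows = list(chip_windows(1024, 600))
    print(len(windows), "windows; first:", windows[0])
    print("A 512-pixel image side spans about", pixels_to_metres(512), "m")
```

Under these assumptions, a 3 m tank is only about 5 pixels across, which is consistent with the later finding that smaller tanks were harder to identify.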

The authors used a graphical user interface (GUI) tool called LabelImg to manually annotate and classify the objects of interest in the images. The objects were categorized into classes that included external floating roof tanks, closed roof tanks, spherical pressure tanks, narrow closed roof tanks, sedimentation tanks, and water towers. The research implemented a validation procedure to minimize errors and ensure the quality and consistency of the annotations.
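LabelImg saves annotations as Pascal VOC XML by default, so a bounding-box annotation of this kind can be read with the Python standard library. The sketch below is illustrative only: the sample file name, class label strings, and coordinates are invented for the example and are not taken from the dataset.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hypothetical Pascal VOC annotation; labels and coordinates are made up.
SAMPLE_XML = """<annotation>
  <filename>example_chip.jpg</filename>
  <size><width>512</width><height>512</height><depth>3</depth></size>
  <object>
    <name>closed_roof_tank</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>150</xmax><ymax>170</ymax></bndbox>
  </object>
  <object>
    <name>water_tower</name>
    <bndbox><xmin>300</xmin><ymin>40</ymin><xmax>320</xmax><ymax>60</ymax></bndbox>
  </object>
</annotation>"""

Box = Tuple[int, int, int, int]


def read_voc_boxes(xml_text: str) -> List[Tuple[str, Box]]:
    """Return (class_label, (xmin, ymin, xmax, ymax)) for each annotated object."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bndbox = obj.find("bndbox")
        coords = tuple(int(bndbox.findtext(tag))
                       for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append((obj.findtext("name"), coords))
    return boxes


if __name__ == "__main__":
    for label, box in read_voc_boxes(SAMPLE_XML):
        print(label, box)
```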

The procedure involved three steps: (1) checking for missed objects, (2) adjusting the bounding boxes, and (3) confirming the class labels. The validation procedure was itself evaluated by comparing its results with a ground truth dataset created by an expert. The researchers also obtained geospatial information for each tank, image, and tile, as well as the diameter of each object, developed tile-level annotations, and compiled the full dataset.

Research Findings

The individual images and their annotations were compiled, together with metadata, into a single dataset. It consists of 142,107 objects distributed among seven classes, with narrow closed roof tanks and closed roof tanks comprising 35% and 51% of the dataset, respectively. It covers 48 states and is consistent with existing petroleum datasets that were not used to create it.

The final dataset is publicly available in a Figshare repository in several formats: JavaScript Object Notation (JSON), Geographic JSON (GeoJSON), Environmental Systems Research Institute (ESRI) shapefile, Extensible Markup Language (XML), and Joint Photographic Experts Group (JPG) images.

The study reported the results of the validation procedure and its comparison with the expert-created ground truth dataset. It found that the validation procedure improved the coverage and quality of the annotations, as well as the accuracy of the class labels. The validated annotations achieved an average precision of 0.99 and recall of 0.952 across tank classes, indicating that the bounding boxes corresponded correctly to objects of interest and that fewer than 5% of objects were missed. Moreover, the study analyzed the causes of missed objects and found that smaller tanks were harder to identify.
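To make the two metrics concrete, the following sketch scores a set of annotated boxes against ground truth boxes. It is not the authors' evaluation code: the article does not state the matching criterion, so the greedy one-to-one matching and the intersection-over-union (IoU) threshold of 0.5 used here are assumptions.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0


def precision_recall(annotated: List[Box], truth: List[Box],
                     threshold: float = 0.5) -> Tuple[float, float]:
    """Greedily match each annotated box to its best unmatched truth box.

    precision = matched / annotated boxes (few spurious boxes)
    recall    = matched / truth boxes     (few missed objects)
    """
    unmatched = list(truth)
    true_positives = 0
    for box in annotated:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(annotated) if annotated else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 20, 30, 30), (50, 50, 60, 60)]
    annotated = [(0, 0, 10, 10), (21, 21, 30, 30)]  # one tank was missed
    print(precision_recall(annotated, truth))
```

In this toy case both annotated boxes match a real tank, so precision is 1.0, while one of three tanks is missed, so recall is 2/3. A recall of 0.952 likewise means that about 4.8% of ground truth objects had no matching annotation.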

Applications

The dataset can support a range of applications. Potential uses include:

  • Training and testing data for object detection algorithms, particularly for remotely sensed imagery and ASTs.
  • Providing geospatial data for AST risk assessments, facilitating the estimation of potential impacts from natural and anthropogenic hazards on ASTs and their surroundings.
  • Generating petroleum storage and net capacity estimates by considering the location, size, and type of ASTs.
  • Supporting petrochemical market evaluations and economic assessments based on the distribution and characteristics of ASTs.
  • Contributing to machine learning or computer vision tasks.
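As an illustration of the capacity-estimation use case, a tank's diameter can be turned into a rough volume by treating the tank as a vertical cylinder, V = π(d/2)²h. This is a sketch under stated assumptions: the article mentions diameters but not heights, so the height below is a hypothetical input, and the function names are invented for the example.

```python
import math

BARRELS_PER_M3 = 6.2898  # US oil barrels per cubic metre (approximate)


def cylinder_volume_m3(diameter_m: float, height_m: float) -> float:
    """Gross volume of a vertical cylindrical tank in cubic metres."""
    return math.pi * (diameter_m / 2) ** 2 * height_m


def volume_barrels(diameter_m: float, height_m: float) -> float:
    """Gross volume expressed in US oil barrels."""
    return cylinder_volume_m3(diameter_m, height_m) * BARRELS_PER_M3


if __name__ == "__main__":
    # Hypothetical tank: 30 m diameter with an assumed 12 m shell height.
    print(round(cylinder_volume_m3(30, 12)), "m^3")
    print(round(volume_barrels(30, 12)), "barrels")
```

A real estimate would also need fill levels and tank-type corrections (for example, floating roof position), which this sketch ignores.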

Conclusion

In summary, the dataset contains high-resolution orthorectified aerial imagery, geospatial coordinates, and border vertices for over 130,000 ASTs from five labeled classes. It is publicly accessible and serves a variety of purposes, including production and capacity estimation, risk and hazard assessment, and infrastructure evaluation. Additionally, the study introduced a validation procedure to ensure the quality and reliability of the annotations.

The authors suggested that future work could include updating the dataset with more recent imagery, expanding geographic coverage to other areas, and adding more classes and features to the dataset. They argued that the novel dataset can be a valuable resource for regulators and decision-makers interested in ASTs and associated risks and benefits.

Written by

Muhammad Osama

Muhammad Osama is a full-time data analytics consultant and freelance technical writer based in Delhi, India. He specializes in transforming complex technical concepts into accessible content. He has a Bachelor of Technology in Mechanical Engineering with specialization in AI & Robotics from Galgotias University, India, and he has extensive experience in technical content writing, data science and analytics, and artificial intelligence.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Osama, Muhammad. (2024, January 18). High-Resolution Aerial Dataset of Above-Ground Storage Tanks. AZoAi. Retrieved on January 15, 2025 from https://www.azoai.com/news/20240118/High-Resolution-Aerial-Dataset-of-Above-Ground-Storage-Tanks.aspx.

  • MLA

    Osama, Muhammad. "High-Resolution Aerial Dataset of Above-Ground Storage Tanks". AZoAi. 15 January 2025. <https://www.azoai.com/news/20240118/High-Resolution-Aerial-Dataset-of-Above-Ground-Storage-Tanks.aspx>.

  • Chicago

    Osama, Muhammad. "High-Resolution Aerial Dataset of Above-Ground Storage Tanks". AZoAi. https://www.azoai.com/news/20240118/High-Resolution-Aerial-Dataset-of-Above-Ground-Storage-Tanks.aspx. (accessed January 15, 2025).

  • Harvard

    Osama, Muhammad. 2024. High-Resolution Aerial Dataset of Above-Ground Storage Tanks. AZoAi, viewed 15 January 2025, https://www.azoai.com/news/20240118/High-Resolution-Aerial-Dataset-of-Above-Ground-Storage-Tanks.aspx.
