MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields

In an article recently published in the journal Scientific Data, researchers presented a manually curated and annotated dataset of diverse weed species in sorghum and maize for computer vision.

Study: MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields. Image credit: RachenStocker/Shutterstock

Importance of sustainable weed management

In crop production, weeds are undesirable as they adversely impact crop development by competing with the crop for water, space, sunlight, and nutrients. Thus, weeds lead to decreased crop productivity, increased agricultural production cost, and greater challenges during harvesting.

Additionally, weeds can be hosts for diseases and insects, increasing the need for control strategies. However, weeds can have positive effects on soil structure and biodiversity. Thus, only highly invasive and competitive weed species must be removed to ensure more sustainable agriculture.

Sustainable weed management strategies are crucial to feed the global population while conserving biodiversity and ecosystems. Site-specific, automated weed control can reduce the additional effort and time that weeding requires. Several studies have demonstrated methods, including computer vision-based ones, for detecting weeds automatically in greenhouses or in the field.

The need for high-quality data

Although machine vision-based methods have displayed their effectiveness for weed detection, they require high-quality data on the species in a specific agricultural area. Specifically, the validation, assessment, and development of such systems depend on the availability of high-quality weed diversity data in a particular region.

Although many datasets are publicly available, most lack the detailed annotations needed for precise plant phenotyping. Several also lack data variability, which limits their use in many studies. Only a few datasets, including Pl@ntNet-300k and the Open Plant Phenotype Database (OPPD), contain more than 100,000 annotated plant samples.

However, Pl@ntNet-300k supports only classification tasks and does not track plant growth stages, which are essential for understanding growth dynamics. Similarly, OPPD lacks instance and semantic segmentation masks, which are crucial for precise phenotyping. Moreover, bounding boxes alone are often too coarse for weed management applications, which call for semantic segmentation masks or the more precise instance segmentation masks.
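To illustrate why a bounding box can be too coarse, the following minimal sketch derives a box from a binary segmentation mask and measures how much of that box is actually plant. The function names and the toy mask are hypothetical and are not taken from the study or from any of the datasets mentioned.

```python
def bounding_box(mask):
    """Return (row_min, col_min, row_max, col_max) of nonzero pixels, or None if the mask is empty."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None
    return min(rows), min(cols), max(rows), max(cols)


def box_fill_ratio(mask):
    """Fraction of the bounding box area that is covered by plant pixels."""
    box = bounding_box(mask)
    if box is None:
        return 0.0
    r0, c0, r1, c1 = box
    box_area = (r1 - r0 + 1) * (c1 - c0 + 1)
    plant_pixels = sum(1 for row in mask for value in row if value)
    return plant_pixels / box_area


# Hypothetical toy mask: a thin leaf lying diagonally across the image.
toy_mask = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

print(bounding_box(toy_mask))    # (0, 0, 3, 3)
print(box_fill_ratio(toy_mask))  # 0.25
```

In this toy case the box spans the whole image although only a quarter of its pixels belong to the plant, which is the kind of detail a segmentation mask preserves and a box discards.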

The proposed dataset

In this study, researchers presented a high-quality dataset of different plant species, designated the Moving Fields Weed Dataset (MFWD), which captures the growth of 28 common weed species found in maize and sorghum fields in Germany. They acquired 94,321 images in a fully automated, high-throughput phenotyping facility to track more than 5,000 individual plants at high temporal and spatial resolution. The facility's automatic irrigation system and controlled illumination ensured a high degree of automation.

Additionally, the researchers provided a rich set of manually curated ground-truth information, including semantic and instance segmentation masks for a subset of the images, to make the dataset suitable for weed management tasks. This information can be used for object detection, instance segmentation, plant species classification, and multiple object tracking.

The dataset includes images of each plant captured multiple times per day, including in the evening, when the appearance of some species changes owing to their dependence on sunlight. Data was also generated for multiple varieties of maize and sorghum, with a focus on the seedling weeds that are common in agricultural areas where these crops are grown.

A greenhouse experiment was conducted at the Moving Fields (MF) facility of the Bavarian State Research Center for Agriculture to generate a dataset of high-quality images that capture the early growth dynamics of individual plants of multiple weed species.

Plant and weed species were selected for the dataset based on how common they are in sorghum and maize fields in Germany, their commercial availability, and their ability to grow under the climate-controlled conditions of a greenhouse.

One of the Scanalyzer three-dimensional (3D) imaging cabins of the MF facility was employed to generate high-resolution, well-illuminated, top-down images of the experimental units. The open-source Computer Vision Annotation Tool (CVAT) was used as a self-hosted solution to label the complete dataset.

Significance of the study

Researchers performed a simple baseline experiment on the image classification task to evaluate the feasibility of using MFWD. Specifically, they focused on multi-species classification for sorghum and excluded all maize images.

The dataset of 27 plant species used for this experiment contained 167,505 images, which were split into training, validation, and test sets. Two deep learning model architectures, EfficientNet_b0 and ResNet-10, were used to assess the classification performance.
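The article does not describe how the split was made. Because the dataset tracks individual plants over time, one plausible approach is to split by plant rather than by image, so that pictures of the same plant never appear in more than one set. The sketch below shows that idea under stated assumptions: the record layout, the file names, and the 70/15/15 proportions are all hypothetical and are not taken from the study.

```python
import random


def split_by_plant(records, train_frac=0.7, val_frac=0.15, seed=0):
    """Split (image_name, plant_id) records so that each plant lands in exactly one set.

    The fractions apply to the number of plants, not the number of images.
    """
    plant_ids = sorted({plant_id for _, plant_id in records})
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(plant_ids)

    n_train = int(len(plant_ids) * train_frac)
    n_val = int(len(plant_ids) * val_frac)

    assignment = {}
    for index, plant_id in enumerate(plant_ids):
        if index < n_train:
            assignment[plant_id] = "train"
        elif index < n_train + n_val:
            assignment[plant_id] = "val"
        else:
            assignment[plant_id] = "test"

    splits = {"train": [], "val": [], "test": []}
    for image_name, plant_id in records:
        splits[assignment[plant_id]].append(image_name)
    return splits


# Hypothetical records: 20 plants, each imaged at 3 time points.
records = [(f"plant{p:02d}_t{t}.png", p) for p in range(20) for t in range(3)]
splits = split_by_plant(records)
print({name: len(images) for name, images in splits.items()})
```

Splitting at the plant level is a design choice aimed at avoiding near-duplicate images of one plant leaking from the training set into the test set; whether the authors did this is not stated here.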

After all models were trained on the training set, EfficientNet_b0 with a learning rate of 5.4 × 10⁻⁴ achieved the best result on the validation set, an F1-score of 90%. The best-performing model was then applied to the test set to assess how well it generalizes to unseen data. It achieved a weighted F1-score of 90.57%, which indicated good generalization performance on the MFWD dataset.
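For readers unfamiliar with the metric, a weighted F1-score is commonly computed as the per-class F1-score averaged with weights proportional to each class's number of true samples. The sketch below implements that common definition in plain Python; it is not the authors' evaluation code, and the class labels and predictions are hypothetical.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's share of true samples."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = count - tp
        denominator = 2 * tp + fp + fn
        f1 = 2 * tp / denominator if denominator else 0.0
        score += (count / total) * f1
    return score


# Hypothetical labels for a three-class toy problem.
y_true = ["sorghum"] * 4 + ["weed_a"] * 2 + ["weed_b"] * 2
y_pred = ["sorghum", "sorghum", "sorghum", "weed_a",
          "weed_a", "weed_a", "weed_b", "sorghum"]

print(round(weighted_f1(y_true, y_pred), 4))  # 0.7417
```

Weighting by class size means frequent species influence the score more than rare ones, which matters when the number of images per species is uneven.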

To summarize, the findings of this study demonstrated that pre-training a model using the proposed MFWD dataset and fine-tuning the model to the target task could be a feasible strategy for scaling up weed detection in agricultural landscapes.

Journal reference:

Written by

Samudrapom Dam

Samudrapom Dam is a freelance scientific and business writer based in Kolkata, India. He has been writing articles related to business and scientific topics for more than one and a half years. He has extensive experience in writing about advanced technologies, information technology, machinery, metals and metal products, clean technologies, finance and banking, automotive, household products, and the aerospace industry. He is passionate about the latest developments in advanced technologies, the ways these developments can be implemented in a real-world situation, and how these developments can positively impact common people.

Citations

Please use one of the following formats to cite this article in your essay, paper or report:

  • APA

    Dam, Samudrapom. (2024, January 31). MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields. AZoAi. Retrieved on July 06, 2024 from https://www.azoai.com/news/20240131/MFWD-Dataset-for-High-Quality-Weed-Species-Analysis-in-Maize-and-Sorghum-Fields.aspx.

  • MLA

    Dam, Samudrapom. "MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields". AZoAi. 06 July 2024. <https://www.azoai.com/news/20240131/MFWD-Dataset-for-High-Quality-Weed-Species-Analysis-in-Maize-and-Sorghum-Fields.aspx>.

  • Chicago

    Dam, Samudrapom. "MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields". AZoAi. https://www.azoai.com/news/20240131/MFWD-Dataset-for-High-Quality-Weed-Species-Analysis-in-Maize-and-Sorghum-Fields.aspx. (accessed July 06, 2024).

  • Harvard

    Dam, Samudrapom. 2024. MFWD Dataset for High-Quality Weed Species Analysis in Maize and Sorghum Fields. AZoAi, viewed 06 July 2024, https://www.azoai.com/news/20240131/MFWD-Dataset-for-High-Quality-Weed-Species-Analysis-in-Maize-and-Sorghum-Fields.aspx.
