Event-Based Segmentation Dataset (ESD): A Leap Forward in Neuromorphic Vision for Object Segmentation

In a paper published in the journal Scientific Data, researchers addressed the limitations of conventional cameras, such as motion blur and low dynamic range, by turning to event-based cameras. The paper introduced the event-based segmentation dataset (ESD), responding to the limited availability of specialized datasets for assessing segmentation algorithms, particularly in scenarios involving occluded scenes.

Study: Event-Based Segmentation Dataset (ESD): A Leap Forward in Neuromorphic Vision for Object Segmentation. Image credit: ImageFlow/Shutterstock

This high-quality 3D spatial-temporal dataset comprised 145 sequences with 14,166 annotated red, green, and blue (RGB) frames and a substantial number of events from two event-based cameras in a stereo configuration. ESD pioneered a densely annotated benchmark for tabletop object segmentation, providing event-wise depth, annotated instance labels, and corresponding RGB-depth (RGBD) frames to offer the research community an exceptional and challenging segmentation dataset.

Related Work

Previous research has focused on the demand for versatile robots in the fourth industrial revolution, particularly emphasizing gripper-equipped robots for efficient grasping tasks in unstructured environments. While traditional vision sensors such as RGB and RGBD cameras have been predominant, their limitations in power consumption and motion blur have led to the exploration of neuromorphic vision sensors. Despite significant advancements in traditional segmentation datasets, event-based instance segmentation of tabletop objects remains largely unexplored.

ESD Experimental Overview

Researchers meticulously designed the experimental setup for the ESD to tackle challenges in object segmentation, particularly in unstructured environments and under occlusion. Depth information, crucial for segmentation tasks, is acquired through a stereo camera setup and multi-view stereo (MVS) techniques, along with a monocular depth estimation model trained on extensive datasets.

The hardware setup involves a Universal Robots UR10 robotic arm carrying three cameras: an RGBD camera (Intel RealSense D435) and two event cameras (DAVIS 346C). Researchers position the event cameras in a stereo configuration, providing depth information through coordinated mapping.

Synchronization connectors ensure that events are triggered simultaneously on both cameras. The relative tilt angle between the two event cameras is set at five degrees so that their fields of view overlap completely, enhancing the effectiveness of the stereo setup. The experimental protocol involves collecting two subsets, ESD-1 (training) and ESD-2 (testing), featuring up to ten and five objects, respectively, under varying lighting conditions, camera movement speeds, trajectories, and occlusion scenarios.
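As a rough illustration of the geometry involved, the sketch below converts stereo disparity between the two event cameras into metric depth via Z = fB/d. The focal length and baseline here are hypothetical placeholders, not the paper's calibration values.

```python
import numpy as np

# Hypothetical calibration values for illustration only; the actual
# parameters come from the dataset's calibration files.
FOCAL_LENGTH_PX = 450.0  # focal length in pixels
BASELINE_M = 0.10        # distance between the two event cameras, in meters

def disparity_to_depth(disparity_px: np.ndarray) -> np.ndarray:
    """Convert per-pixel stereo disparity to metric depth via Z = f * B / d."""
    with np.errstate(divide="ignore"):
        depth = FOCAL_LENGTH_PX * BASELINE_M / disparity_px
    depth[~np.isfinite(depth)] = 0.0  # zero-disparity pixels have no depth
    return depth
```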

Through a transformation process, researchers derive depth information for monocular event-based cameras by mapping depth from the RGBD camera frame into the event camera coordinate system. This approach enables the development of segmentation algorithms that use depth with a single event camera, supporting real-time depth estimation tasks.
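A minimal sketch of such a mapping is shown below: depth pixels are back-projected to 3D with the RGBD intrinsics, transformed by the RGBD-to-event extrinsics, and reprojected with the event camera intrinsics. All matrices here are hypothetical placeholders, and occlusion handling (z-buffering) is omitted.

```python
import numpy as np

# Hypothetical intrinsics and extrinsics for illustration; the dataset
# ships its own calibration between the RGBD and event cameras.
K_RGBD = np.array([[615.0, 0.0, 320.0], [0.0, 615.0, 240.0], [0.0, 0.0, 1.0]])
K_EVENT = np.array([[450.0, 0.0, 173.0], [0.0, 450.0, 130.0], [0.0, 0.0, 1.0]])
R = np.eye(3)                   # rotation from RGBD frame to event frame
t = np.array([0.05, 0.0, 0.0])  # translation in meters
H_EV, W_EV = 260, 346           # DAVIS 346 sensor resolution

def map_depth_to_event_frame(depth_rgbd: np.ndarray) -> np.ndarray:
    """Warp an RGBD depth map into the event camera's image plane."""
    h, w = depth_rgbd.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth_rgbd.ravel()
    valid = z > 0
    # Back-project valid RGBD pixels into 3D points in the RGBD frame.
    rays = np.linalg.inv(K_RGBD) @ np.vstack([u.ravel(), v.ravel(), np.ones(h * w)])
    pts = rays[:, valid] * z[valid]
    # Transform into the event camera frame and project onto its image plane.
    pts_ev = R @ pts + t[:, None]
    proj = K_EVENT @ pts_ev
    ue = (proj[0] / proj[2]).astype(int)
    ve = (proj[1] / proj[2]).astype(int)
    depth_event = np.zeros((H_EV, W_EV))
    inside = (ue >= 0) & (ue < W_EV) & (ve >= 0) & (ve < H_EV) & (pts_ev[2] > 0)
    depth_event[ve[inside], ue[inside]] = pts_ev[2, inside]  # no z-buffering
    return depth_event
```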

The dataset extends its versatility by providing both the stereo event camera configuration and RGBD information. The UR10 robot offers stable control of camera movement, with the cameras fixed on its end effector. The event cameras are synchronized, and data collection spans diverse experimental conditions to create a comprehensive and challenging dataset.

Researchers manually annotated RGB frames using the computer vision annotation tool (CVAT), addressing challenges such as occlusion and the motion blur caused by the low sampling rate of the RGBD camera. They implemented a two-step labeling process to ensure precision, in which initial annotations are refined by analyzing the corresponding events. Events are then automatically labeled according to the annotated RGB masks, yielding a continuously labeled data stream for segmentation tasks.
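In its simplest form, this event labeling step amounts to a per-event mask lookup, as sketched below. The event field layout and coordinate conventions are assumptions for illustration, not the dataset's published schema.

```python
import numpy as np

def label_events(events: np.ndarray, instance_mask: np.ndarray) -> np.ndarray:
    """Assign each event the instance ID of the mask pixel it falls on.

    `events` is assumed to be an (N, 4) array of (x, y, timestamp, polarity);
    `instance_mask` is an (H, W) instance-ID image derived from the annotated
    RGB frame nearest in time, already warped into event-camera coordinates.
    """
    x = events[:, 0].astype(int)
    y = events[:, 1].astype(int)
    return instance_mask[y, x]  # 0 = background, >0 = object instance ID
```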

The automatic annotation process involves transformations between RGBD and event camera coordinates, using an iterative closest point (ICP) algorithm for alignment. Researchers divide the dataset into training and testing subsets featuring up to ten and five objects, respectively, with variations in lighting conditions, camera heights, and occlusion scenarios. The challenging attributes of the dataset are demonstrated through visualizations of examples from ESD-1 and ESD-2, highlighting conditions such as the number of objects and the occlusions among them.
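As a sketch of what the ICP alignment step involves, the snippet below uses Open3D's registration API; the paper does not necessarily use this library, and the correspondence threshold is a placeholder.

```python
import numpy as np
import open3d as o3d

def align_point_clouds(src_pts: np.ndarray, tgt_pts: np.ndarray) -> np.ndarray:
    """Estimate the rigid transform aligning two (N, 3) point sets with ICP."""
    src = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(src_pts))
    tgt = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(tgt_pts))
    result = o3d.pipelines.registration.registration_icp(
        src, tgt,
        max_correspondence_distance=0.02,  # 2 cm threshold; a placeholder
        init=np.eye(4),
        estimation_method=o3d.pipelines.registration.TransformationEstimationPointToPoint(),
    )
    return np.asarray(result.transformation)  # 4x4 homogeneous transform
```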

Technical Validation

The ESD dataset is meticulously validated using standard segmentation metrics, focusing on instance and semantic segmentation tasks. The evaluation metrics are pixel accuracy and mean intersection over union (mIoU): pixel accuracy is the percentage of correctly classified pixels, while mIoU is effective for imbalanced binary and multi-class segmentation; both are used to quantify the testing results.
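Both metrics are straightforward to compute from predicted and ground-truth label maps; a minimal reference implementation follows.

```python
import numpy as np

def pixel_accuracy(pred: np.ndarray, gt: np.ndarray) -> float:
    """Fraction of pixels whose predicted class matches the ground truth."""
    return float((pred == gt).mean())

def mean_iou(pred: np.ndarray, gt: np.ndarray, num_classes: int) -> float:
    """Mean intersection-over-union over classes present in either label map."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, gt == c).sum()
        union = np.logical_or(pred == c, gt == c).sum()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious))
```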

Several well-known segmentation methods, including the fully convolutional network (FCN), U-shaped network (U-Net), and DeepLabV3+, are applied to the manually labeled RGB frames from the ESD dataset. Results indicate challenges related to image blurring caused by camera motion, which lowers the accuracy and mIoU scores. However, ESD's tabletop setting, specifically designed for object segmentation, provides a relatively favorable environment for distinguishing foreground from background.
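For a sense of what such an evaluation looks like in practice, the sketch below runs a pretrained DeepLabV3 from torchvision (which ships the base DeepLabV3 rather than the plus variant) on a single RGB frame; the study fine-tunes its models on ESD, so this is a starting point, not a reproduction of its setup.

```python
import torch
from PIL import Image
from torchvision.models.segmentation import (
    DeepLabV3_ResNet50_Weights,
    deeplabv3_resnet50,
)

# Load a pretrained DeepLabV3 with a ResNet-50 backbone.
weights = DeepLabV3_ResNet50_Weights.DEFAULT
model = deeplabv3_resnet50(weights=weights).eval()
preprocess = weights.transforms()

def segment(image: Image.Image) -> torch.Tensor:
    """Return an (H, W) map of per-pixel class indices for one RGB frame."""
    batch = preprocess(image).unsqueeze(0)  # normalize and resize
    with torch.no_grad():
        logits = model(batch)["out"]        # (1, num_classes, H, W)
    return logits.argmax(dim=1).squeeze(0)
```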

Transfer learning approaches are employed to extract features from both RGB frames and event streams. Testing results reveal that integrating RGB and event data significantly enhances segmentation accuracy for known objects. However, challenges arise in segmenting unknown objects, highlighting the complexity of this task. The dataset's attributes, including varying camera trajectories, movement speeds, lighting conditions, and object scenarios, contribute to a comprehensive evaluation.
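One common way to integrate the two modalities, sketched below under the assumption of per-pixel fusion, is to accumulate events into a count image and stack it with the RGB channels as an extra network input; the paper's actual fusion strategy may differ.

```python
import numpy as np

def events_to_count_image(events: np.ndarray, h: int = 260, w: int = 346) -> np.ndarray:
    """Accumulate events into a per-pixel count image, one common event representation."""
    img = np.zeros((h, w), dtype=np.float32)
    np.add.at(img, (events[:, 1].astype(int), events[:, 0].astype(int)), 1.0)
    return img / max(float(img.max()), 1.0)  # normalize counts to [0, 1]

def fuse_rgb_and_events(rgb: np.ndarray, event_img: np.ndarray) -> np.ndarray:
    """Stack an RGB frame (H, W, 3) and an event image (H, W) into a 4-channel input."""
    return np.dstack([rgb.astype(np.float32) / 255.0, event_img])
```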

Conclusion

In conclusion, the ESD offers a robust platform for advancing object segmentation research in complex environments, particularly addressing challenges like occlusion. The carefully crafted experimental setup, diverse conditions, and RGB and event data integration showcase the dataset's versatility. Despite notable success in known object segmentation, challenges persist, especially in segmenting unknown objects. The evaluation metrics and detailed analyses contribute valuable insights, providing researchers with a comprehensive benchmark for further advancements in neuromorphic vision-based segmentation algorithms.

Written by

Silpaja Chandrasekar

Dr. Silpaja Chandrasekar has a Ph.D. in Computer Science from Anna University, Chennai. Her research expertise lies in analyzing traffic parameters under challenging environmental conditions. Additionally, she has gained valuable exposure to diverse research areas, such as detection, tracking, classification, medical image analysis, cancer cell detection, chemistry, and Hamiltonian walks.
