In a paper published in the journal Scientific Data, researchers introduced InsectSound1000, a dataset containing over 169,000 labeled sound samples from 12 insect species. Samples range from the loud buzz of Bombus terrestris to the quiet sounds of Aphidoletes aphidimyza, which are barely audible to humans.
Recordings were made in an anechoic box using a four-channel low-noise measurement microphone array. Each sample is a four-channel wave file of 2500 ms length, with a 16 kHz sample rate and 32-bit resolution. This dataset has significant potential for training deep-learning (DL) models and developing acoustic insect recognition systems for pest and ecological monitoring.
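The sample format described above can be checked with Python's standard wave module. The sketch below writes a synthetic stand-in file with the stated properties and reads them back; the filename is hypothetical, the real dataset uses its own naming scheme, and the stdlib wave module stores 32-bit samples as integer PCM, whereas the actual files may use a different 32-bit encoding.

```python
import wave

# Hypothetical filename; InsectSound1000 files follow their own naming scheme.
PATH = "sample.wav"

# Synthetic stand-in: 4 channels, 16 kHz, 32-bit samples, 2500 ms long.
N_CHANNELS, RATE, SAMPWIDTH, DURATION_MS = 4, 16_000, 4, 2500
n_frames = RATE * DURATION_MS // 1000  # 40,000 frames

with wave.open(PATH, "wb") as w:
    w.setnchannels(N_CHANNELS)
    w.setsampwidth(SAMPWIDTH)  # 4 bytes per sample = 32-bit resolution
    w.setframerate(RATE)
    w.writeframes(b"\x00" * n_frames * N_CHANNELS * SAMPWIDTH)

# Reading it back recovers the format described in the paper.
with wave.open(PATH, "rb") as w:
    print(w.getnchannels(), w.getframerate(), 8 * w.getsampwidth())  # 4 16000 32
    print(w.getnframes() / w.getframerate())  # 2.5 seconds
```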
Background
Past work developed InsectSound1000, a high-quality insect recording dataset that supports greenhouse pest management. Initial studies showcased the dataset's utility in training robust acoustic insect detection algorithms. These algorithms, leveraging DL, demonstrated promising classification accuracy even amidst simulated environmental noise.
InsectSound1000's potential extends beyond pest management, offering insights into broader ecological surveys. Its extensive size and quality enable efficient model pretraining, which enhances insect sound recognition systems across varied hardware.
Optimizing Insect Sound Recording
The recording setup for the InsectSound1000 dataset was meticulously designed to meet stringent requirements for noise shielding, reverberation control, and practical usability. The setup featured a double-wall anechoic box constructed from medium-density fiberboard (MDF) panels. This design effectively minimized environmental noise interference, particularly in the low-frequency range critical for insect sound recordings.
The inner box, suspended within the outer box by springs, was lined with layers of open-cell acoustic foam to absorb reverberations and shield against high-frequency noise. Special attention was paid to mitigating the influence of artificial surroundings on insect sounds, ensuring that recordings closely resembled real-world conditions in greenhouse environments. A microphone array comprising four low-noise microphones was strategically arranged to capture insect sounds from multiple angles, enhancing the dataset's value for acoustic classification tasks.
To optimize the recording environment for capturing insect sounds in a laboratory setting, the study tested various types of insect containers, including Petri dishes, styrofoam take-out soup containers, mesh insect-rearing bags, and cages of different sizes, to assess their impact on the recorded sound. Larger cages constructed from mesh material proved best at minimizing reverberations and footstep sounds, with dimensions of 32.5 cm × 32.5 cm × 77 cm fitting the anechoic box.
In addition, plants and soil were included inside the cage to replicate a greenhouse environment, thereby reducing unwanted sounds generated by insects interacting with the cage itself. Despite challenges such as fungal gnat infestations, measures such as soil disinfection and the use of clay pellets maintained an environment conducive to both the insects and the recordings.
In terms of recording hardware and process, meticulous attention was given to every component of the measurement chain to ensure optimal recording quality. This involved carefully selecting equipment, such as amplifiers and converters, and setting appropriate filters and gain levels. Recordings were made overnight to minimize external noise interference, with specific adjustments made to lighting schedules to accommodate nocturnal insect activity.
Moreover, a sophisticated signal processing pipeline was employed to extract sound events from raw recordings, downsample them to a standardized format, and filter out unwanted background noise. While external noise was mitigated during the extraction process, steps were taken to ensure that any remaining noise would not significantly impact the usability of the resulting dataset for training DL models.
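The extraction steps described above can be sketched in outline: high-pass filter the signal, then keep windows whose energy exceeds a threshold. The filter design, cutoff, window length, and threshold below are illustrative assumptions, not the authors' actual pipeline parameters.

```python
import numpy as np

def high_pass(x, rate, cutoff=450.0):
    """First-order high-pass filter (a simple RC approximation, standing
    in for whatever filter design the authors actually used)."""
    dt = 1.0 / rate
    rc = 1.0 / (2 * np.pi * cutoff)
    alpha = rc / (rc + dt)
    y = np.empty_like(x)
    y[0] = x[0]
    for i in range(1, len(x)):
        y[i] = alpha * (y[i - 1] + x[i] - x[i - 1])
    return y

def extract_events(x, rate, win_ms=100, threshold=0.05):
    """Return (start, end) sample indices of non-overlapping windows whose
    RMS energy exceeds a fixed threshold -- a stand-in event detector."""
    win = int(rate * win_ms / 1000)
    events = []
    for start in range(0, len(x) - win + 1, win):
        frame = x[start:start + win]
        if np.sqrt(np.mean(frame ** 2)) > threshold:
            events.append((start, start + win))
    return events

# Demo: silence with a short burst of a 2 kHz "insect" tone.
rate = 16_000
t = np.arange(rate) / rate                 # 1 s of audio
x = np.zeros(rate)
x[6000:8000] = 0.2 * np.sin(2 * np.pi * 2000 * t[6000:8000])
filtered = high_pass(x, rate)
print(extract_events(filtered, rate))      # → [(4800, 6400), (6400, 8000)]
```

The detected windows are the ones overlapping the burst; in the real pipeline, such segments would then be downsampled and written out as labeled samples.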
Experimental Assessment
The experimental assessment aimed to quantify the noise isolation properties of the self-built anechoic enclosure used to record the InsectSound1000 dataset. By conducting comparative measurements with a speaker placed outside the box and a microphone array positioned inside, the study evaluated the box's ability to attenuate external noise across different frequency ranges.
The results indicated that while the anechoic box effectively reduced high-frequency noise, it provided minimal shielding against low-frequency sounds, particularly those below 450 Hz. Despite this limitation, the setup demonstrated satisfactory performance within project constraints, offering substantial noise reduction for frequencies above 1000 Hz and ensuring relatively clean recordings of insect sounds within the desired frequency range.
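Attenuation figures like these can be understood as band-limited level differences between a reference signal outside the box and the same signal measured inside. A minimal numpy sketch (the function name and band edges are illustrative, not from the paper):

```python
import numpy as np

def band_attenuation_db(outside, inside, rate, band):
    """Attenuation in dB within a frequency band, computed from the power
    spectra of a reference (outside) and a shielded (inside) recording."""
    lo, hi = band
    freqs = np.fft.rfftfreq(len(outside), d=1.0 / rate)
    mask = (freqs >= lo) & (freqs < hi)
    p_out = np.abs(np.fft.rfft(outside))[mask] ** 2
    p_in = np.abs(np.fft.rfft(inside))[mask] ** 2
    return 10 * np.log10(p_out.sum() / p_in.sum())

# Synthetic check: a 2 kHz tone "inside" at one tenth the amplitude of the
# same tone "outside" corresponds to 20 dB of attenuation.
rate = 16_000
t = np.arange(rate) / rate
outside = np.sin(2 * np.pi * 2000 * t)
inside = 0.1 * outside
print(round(band_attenuation_db(outside, inside, rate, (1000, 3000)), 1))  # 20.0
```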
Recording sessions were conducted overnight to minimize external disturbances. Additionally, the researchers employed signal processing techniques such as aggressive high-pass filtering during data extraction to reduce background noise in the final dataset. Despite the likelihood of some residual background noise in the raw recordings, the combination of physical noise shielding and tailored signal processing produced a dataset with high labeling accuracy and sound quality suitable for machine learning applications.
This approach allowed for the development of the InsectSound1000 dataset, which offers superior quality and labeling accuracy compared to existing insect sound datasets. It facilitated advancements in automated data processing and machine learning (ML) models for insect sound classification.
Conclusion
To sum up, InsectSound1000 is a comprehensive dataset comprising labeled sound samples from 12 insect species, ranging from audible buzzes to sounds barely perceptible to humans. It was compiled from extensive recordings within an anechoic box using a four-channel microphone array, and each sample provides valuable data for training deep learning models. With potential applications in automated insect sensing for pest and ecological monitoring, the dataset's high-quality recordings and detailed methodology support further research and extension.