A new AI-driven, drone-based system for aerial person detection (APD) stands to transform search and rescue (SaR) operations by precisely identifying individuals in challenging environments such as dense forests and disaster zones. The system uses both visible and infrared imagery to overcome obstacles including poor lighting, occlusion, and scale variation.

Challenges in APD. (A) Size and perspective variability: From various aerial viewpoints and altitudes, most person instances appear smaller than 20 pixels. (B) Sparse distribution: In SaR operations, persons are often thinly dispersed across vast areas. (C) Lighting and visibility issues: Lighting conditions substantially affect the visibility of persons, who can easily be obscured by background clutter. (D) Device constraints: Deploying AI algorithms on devices with limited computational capability remains challenging, restricting real-time processing and analysis.
Search and rescue operations frequently encounter challenges due to unpredictable weather, rugged terrain, and limited resources. Traditional methods rely heavily on experienced personnel, making the process time-consuming and labor-intensive. While unmanned aerial vehicles (UAVs) offer a promising alternative, their effectiveness has been constrained by difficulties in detecting small or partially visible individuals in aerial imagery. These limitations underscore the need for enhanced APD technologies to improve the accuracy and efficiency of rescue efforts.
In a study published in the Journal of Remote Sensing, researchers from Northwestern Polytechnical University and Yan'an University developed an AI-powered APD system designed to improve the detection of individuals in aerial images captured by drones. The system addresses critical challenges, including occlusion, scale variation, and changing lighting conditions, aiming to enhance the precision and reliability of SaR operations, particularly in remote and inaccessible areas.

The structured taxonomy of deep learning-based APD methodologies comprises four broad categories. (A) Object-aware methods focus primarily on detecting smaller objects, enhanced by a wide range of advanced detection techniques. (B) Sample-oriented methods improve detectors for sparse samples, incorporating few-shot, zero-shot, and synthetic instances. (C) Information-fusion methods concentrate mainly on traditional mechanisms for fusing data from visible and infrared imagery, supplemented by transformer-based feature fusion and decision-level integration enabled by ensemble learning. (D) Lightweight methods exploit strategic parameter sparsity across channels and spatial dimensions, augmented by network pruning and low-precision quantization. Each category is distinguished by a unique color marker, and the illustration presents only a select few representative methods from each; a minimal sketch of the fusion idea follows below.
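To make the information-fusion idea in category (C) concrete, the following is a minimal sketch of one common pattern: a two-stream network that extracts features from aligned visible and infrared inputs and fuses them by channel concatenation. All architectural details here (stream depths, channel widths, the 1x1 mixing convolution) are illustrative assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class MidLevelFusion(nn.Module):
    """Toy two-stream backbone that fuses visible (RGB, 3-channel) and
    infrared (1-channel) feature maps by concatenation. Channel widths
    are illustrative assumptions, not the paper's architecture."""

    def __init__(self, out_channels: int = 64):
        super().__init__()
        self.rgb_stream = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.ir_stream = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        # Fuse by channel concatenation, then mix with a 1x1 convolution.
        self.fuse = nn.Conv2d(64, out_channels, kernel_size=1)

    def forward(self, rgb: torch.Tensor, ir: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([self.rgb_stream(rgb), self.ir_stream(ir)], dim=1)
        return self.fuse(fused)

# Example: one 256x256 RGB frame and its spatially aligned infrared frame.
model = MidLevelFusion()
out = model(torch.randn(1, 3, 256, 256), torch.randn(1, 1, 256, 256))
print(out.shape)  # torch.Size([1, 64, 256, 256])
```

Decision-level fusion, by contrast, would run separate detectors on each modality and merge their predictions, trading richer joint features for simpler deployment.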
The research team compiled the VTSaR dataset, which covers diverse environments, human behaviors, and multiple capture angles. The dataset integrates visible and infrared images alongside synthetic data to establish a comprehensive benchmark for APD. Testing of several detection algorithms demonstrated notable improvements in detection accuracy and efficiency. The proposed system performed well under challenging conditions, outperforming existing technologies in handling occlusion and variations in scale and lighting.
"Our research contributes to the development of more effective Aerial Person Detection for search and rescue missions," said Dr. Xiangqing Zhang, the study's lead researcher. "By integrating AI with multimodal data fusion, we have designed a system that improves detection capabilities in complex environments, making SaR operations more efficient and reliable."
Illustration of the collection and processing procedures employed by the self-built SaR system. The scenarios originate from areas in Xinyang and Huizhou, China, where samples spanning both RGB and infrared bands were collected to create the VTSaR dataset.
The study utilized a specially designed unmanned helicopter equipped with a dual-camera gimbal system to capture aerial images from diverse environments, including urban, suburban, maritime, and wilderness areas. The VTSaR dataset comprises three versions: Unaligned VTSaR (UA-VTSaR), Aligned VTSaR (A-VTSaR), and Aligned Synthetic VTSaR (AS-VTSaR), which collectively contain a total of 19,956 real-world instances and 54,749 synthetic instances. Researchers tested models such as YOLOv8-s and EfficientViT, achieving a precision of 95.03% and a mean average precision (mAP) of 94.91%. The study emphasized the importance of combining visible and infrared imagery to enhance detection performance across different environmental conditions.
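As a rough illustration of how a baseline like YOLOv8-s could be run on aerial frames, the snippet below performs person detection with a stock COCO-pretrained checkpoint via the Ultralytics API. The weight file and image path are placeholders; the paper's VTSaR-trained weights are not assumed to be available, so fine-tuned accuracy figures like those above would not be reproduced by this sketch.

```python
from ultralytics import YOLO  # pip install ultralytics

# Load a stock YOLOv8-s checkpoint (COCO-pretrained, which includes a
# "person" class). The paper's VTSaR-trained weights are not assumed here.
model = YOLO("yolov8s.pt")

# Run inference on an aerial frame; the path and thresholds are illustrative.
# classes=[0] restricts detections to COCO's "person" class.
results = model.predict("aerial_frame.jpg", imgsz=640, conf=0.25, classes=[0])

for result in results:
    for box in result.boxes:
        x1, y1, x2, y2 = box.xyxy[0].tolist()
        print(f"person at ({x1:.0f}, {y1:.0f})-({x2:.0f}, {y2:.0f}), "
              f"confidence {float(box.conf):.2f}")
```

Metrics such as precision and mAP would then be computed by matching these predicted boxes to ground-truth annotations using the usual IoU-based criteria.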
Beyond SaR, the APD system has potential applications in disaster response, security monitoring, and law enforcement. Improved accuracy in APD could support efforts to locate missing persons, patrol high-risk areas, and respond to emergencies more effectively. As AI and UAV technologies advance, the system may also be adapted for applications such as wildlife monitoring and border surveillance, further contributing to safety and security efforts.
This study presents an AI-powered APD system that improves SaR efficiency by addressing key technical challenges. By leveraging AI-driven analysis and multimodal data fusion, the system offers a more precise and adaptable approach to locating individuals in complex environments. These advancements contribute to making rescue operations more effective and increasing the likelihood of timely intervention in critical situations.