1. Advances in UAV detection: integrating multi-sensor systems and AI for enhanced accuracy and efficiency
Vladislav Semenyuk, Liliya Kurmasheva, Ildar Kurmashev, Alberto Lupidi, Alessandro Cantelli-Forti
The researchers were motivated by the escalating threat posed by unauthorized drones and the
limitations of existing single-sensor UAV detection systems. Their aim was to enhance UAV
detection accuracy, range, and reliability by integrating multiple sensor technologies—radar, RF,
acoustic, and optical—with artificial intelligence (AI) and machine learning (ML). This
comprehensive review outlines the progress made since 2020 in detection methods, highlighting
advances like radar spectrogram analysis with CNNs, RF fingerprinting, acoustic signature
recognition, and deep learning-enhanced sensor fusion. The study finds that combining multiple
sensors significantly improves performance, with some methods reaching detection accuracies
above 99%. However, key problems include high system complexity, environmental interference,
limited detection under adverse conditions, and challenges in real-time adaptability. Future work
involves developing miniaturized, portable systems, improving real-time processing using edge
AI, enhancing sensor fusion algorithms, and ensuring robustness against countermeasures and
cyber threats. These innovations are crucial for securing airspace and protecting critical
infrastructure.
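The review's central claim is that fusing complementary sensors outperforms any single modality. As a concrete illustration of one common decision-level approach, the sketch below combines per-sensor detection confidences by weighted log-odds fusion; the sensor names, scores, and weights are illustrative assumptions, not values from the review.

```python
import numpy as np

# Hypothetical per-sensor confidences for one candidate UAV track
# (values and weights are illustrative, not taken from the review).
sensor_scores = {"radar": 0.82, "rf": 0.91, "acoustic": 0.40, "optical": 0.88}
sensor_weights = {"radar": 0.3, "rf": 0.3, "acoustic": 0.1, "optical": 0.3}

def fuse_log_odds(scores, weights, eps=1e-6):
    """Weighted log-odds (logit) fusion of roughly independent sensor scores."""
    logit = sum(
        w * np.log((scores[s] + eps) / (1.0 - scores[s] + eps))
        for s, w in weights.items()
    )
    return 1.0 / (1.0 + np.exp(-logit))  # map back to a probability

print(f"fused UAV probability: {fuse_log_odds(sensor_scores, sensor_weights):.3f}")
```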
2. RFAG-YOLO: A Receptive Field Attention-Guided YOLO Network for Small-Object Detection in UAV Images
Chengmeng Wei and Wenhong Wang
The researchers were motivated by the critical challenge of detecting small objects in UAV
imagery, which is hampered by low resolution, complex backgrounds, and scale variation. To
overcome these limitations, they proposed RFAG-YOLO, an improved version of YOLOv8
tailored for UAV applications. They introduced three key innovations: the Receptive Field
Network (RFN) block to enhance local feature capture, an improved FasterNet backbone for
efficient multi-scale feature extraction, and a Scale-Aware Feature Amalgamation (SAF) module
to dynamically fuse features across different scales. Their model achieved a 38.9% mAP50 on the
VisDrone2019 dataset—substantially outperforming YOLOv7, YOLOv10, and YOLOv11—while
using significantly fewer parameters than YOLOv8s, making it ideal for resource-constrained
UAV platforms. However, limitations include increased computational complexity from RFN
blocks and constraints from fixed input image resolution. Future work involves optimizing
lightweight attention mechanisms and supporting higher-resolution inputs to further enhance
small-object detection in real-world UAV scenarios.
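The summary names a Scale-Aware Feature Amalgamation (SAF) module that dynamically fuses features across scales. The paper's exact design is not reproduced here, so the PyTorch sketch below shows only the generic idea: resize multi-scale maps to a common resolution and blend them with learned, content-dependent weights. The class name, gating scheme, and tensor shapes are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ScaleAwareFusion(nn.Module):
    """Illustrative scale-aware fusion: resize multi-scale maps to one size,
    then blend them with learned per-scale weights. A generic sketch, not
    the paper's SAF module."""
    def __init__(self, channels, num_scales=3):
        super().__init__()
        # Predict one spatial weight map per scale from the stacked features.
        self.gate = nn.Conv2d(channels * num_scales, num_scales, kernel_size=1)

    def forward(self, feats):
        target = feats[0].shape[-2:]
        resized = [F.interpolate(f, size=target, mode="nearest") for f in feats]
        w = torch.softmax(self.gate(torch.cat(resized, dim=1)), dim=1)
        return sum(w[:, i : i + 1] * resized[i] for i in range(len(resized)))

# Example: fuse P3/P4/P5-like maps with 64 channels each.
p3, p4, p5 = (torch.randn(1, 64, s, s) for s in (80, 40, 20))
fused = ScaleAwareFusion(64, num_scales=3)([p3, p4, p5])
print(fused.shape)  # torch.Size([1, 64, 80, 80])
```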
3. Remote Sensing Analysis of the LIDAR Drone Mapping System
for Detecting Damages to Buildings, Roads, and Bridges Using the Faster
CNN Method
The researchers were motivated by the need for efficient, accurate, and real-time damage detection
in civil infrastructure like buildings, roads, and bridges, especially after disasters. Manual
inspections are labor-intensive, subjective, and often unsafe. To address these limitations, the
study introduced an advanced system that integrates LIDAR-equipped UAVs with the Faster
Convolutional Neural Network (Faster CNN) and CycleGAN-based deep learning models for
crack and damage detection. The system leveraged high-resolution 3D point cloud data and image
processing techniques to automatically detect various structural damages. Their proposed method
achieved a high detection accuracy of 95.88%, surpassing previous techniques. However,
limitations included reliance on UAV camera angles, difficulty in GPS-denied environments, and
the need for beacon enhancements for large structures. The study faced issues with noise, image
resolution, and real-time adaptability. Future work involves enhancing GPS-independent
navigation, reducing false positives, improving edge-device compatibility, and expanding
deployment to larger and more complex infrastructures for broader smart city applications.
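The summary describes the "Faster CNN" detector only at a high level; as a generic stand-in, the sketch below runs torchvision's off-the-shelf Faster R-CNN and keeps confident boxes, the way a fine-tuned damage detector would flag crack regions. The input tensor and score threshold are placeholders, and the paper's actual architecture and damage classes are not reproduced.

```python
import torch
from torchvision.models.detection import fasterrcnn_resnet50_fpn

# Generic two-stage detector inference; a stand-in, not the paper's model.
model = fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

image = torch.rand(3, 512, 512)  # placeholder for a UAV-captured frame
with torch.no_grad():
    pred = model([image])[0]

# Keep confident boxes, i.e. candidate damage regions after fine-tuning.
keep = pred["scores"] > 0.5
print(pred["boxes"][keep], pred["labels"][keep])
```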
4. Vision-Based UAV Detection and Tracking Using Deep Learning and Kalman Filter
Nancy Alshaer, Reham Abdelfatah, Tawfik Ismail, Haitham Mahmoud
The motivation behind this research stems from the increasing use of UAVs across various sectors,
which poses security and safety challenges—particularly in environments where traditional
detection methods like radar or acoustic sensors fail. To address this, the researchers proposed a
robust two-stage vision-based UAV detection and tracking system integrating deep learning and
Kalman filtering. Specifically, they employed YOLO-based models (v3, v4, v5, and YOLOx) for
real-time object detection and coupled them with Kalman Filter (KF) and Extended Kalman Filter
(EKF) for tracking. A novel dataset with over 10,000 annotated images was created, including
various consumer-grade drones and complex backgrounds. The results demonstrated that
YOLOv5 achieved the best detection performance with 99% mAP@0.5 and 75 FPS, while the
addition of KF significantly reduced tracking errors, achieving as low as 6.23 pixels RMSE.
Despite its success, the system struggles with occlusion, blur, and environmental noise, and the
more complex EKF did not significantly outperform the simpler KF. The future work involves integrating
more advanced deep learning models, improving resilience under diverse weather and lighting
conditions, expanding multi-drone tracking, and enhancing the scalability and adaptability of the
system in dynamic real-world environments.
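The tracking stage pairs per-frame detections with a Kalman filter. The minimal constant-velocity filter below tracks a detection's (x, y) centre and coasts through missed frames; it is a sketch of the standard KF equations under assumed noise parameters, not the authors' implementation.

```python
import numpy as np

class CentroidKalman:
    """Constant-velocity Kalman filter over a detection's (x, y) centre."""
    def __init__(self, x, y, dt=1.0, q=1e-2, r=5.0):
        self.x = np.array([x, y, 0.0, 0.0])                   # position + velocity
        self.P = np.eye(4) * 100.0                            # state covariance
        self.F = np.eye(4); self.F[0, 2] = self.F[1, 3] = dt  # motion model
        self.H = np.eye(4)[:2]                                # observe position only
        self.Q = np.eye(4) * q                                # process noise
        self.R = np.eye(2) * r                                # measurement noise (px)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]

    def update(self, z):
        y = np.asarray(z) - self.H @ self.x                   # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)              # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P

# Feed per-frame YOLO detections; predict when a frame has no detection.
kf = CentroidKalman(320, 240)
for det in [(324, 238), None, (331, 235)]:
    kf.predict()
    if det is not None:
        kf.update(det)
    print("estimate:", kf.x[:2].round(1))
```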
5. RipScout: Realtime ML-Assisted Rip Current Detection and Automated
Data Collection Using UAVs
Fahim Hasan Khan, Donald Stewart, James Davis, Akila de Silva, Ashleigh Palinkas, and Alex Pang
The motivation behind this research lies in the significant danger posed by rip currents, which are
responsible for numerous beach-related drownings globally. Traditional detection methods are
either manual or require expensive, non-portable equipment, making large-scale, real-time
detection impractical. To address these issues, the researchers developed RipScout—a
lightweight, machine learning-assisted system that integrates a drone, a mobile device, and real-
time object detection models for efficient rip current identification and automated data collection.
They evaluated three ML models (SSD-MobileNetV2, YOLOv8m, and EfficientDet D2),
ultimately selecting EfficientDet D2 for its optimal balance of accuracy (up to 94.8%) and speed
(17 FPS) suitable for mobile platforms. The system enables non-expert users to collect rip current
data up to four times faster than manual methods, and also contributes a novel multiview sediment
rip dataset. However, limitations include reduced detection in poor lighting conditions, and false
positives (17%) under real-world complexity. Future work aims to expand detection under varied
weather and lighting conditions using data augmentation and generative ML, broaden detection
to additional rip current types, and apply the system architecture to other domains like aquatic
hazard detection and search and rescue missions. RipScout thus sets a strong foundation for
scalable, real-world environmental monitoring using accessible drone technology and mobile AI.
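RipScout's core loop, as summarized, is to run the detector on the live drone feed and collect data automatically whenever a rip current is in view. The sketch below shows that control flow with OpenCV; detect_rip is a hypothetical placeholder for the on-device EfficientDet D2 model, and the file names and 0.5 threshold are assumptions.

```python
import cv2

def detect_rip(frame):
    """Hypothetical placeholder for the on-device detector; returns a list
    of (box, score) pairs."""
    return []

# Minimal loop in the spirit of RipScout: detect on the live feed and
# auto-record only while a rip current is visible.
cap = cv2.VideoCapture("drone_feed.mp4")  # stand-in for the UAV stream
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    detections = detect_rip(frame)
    if any(score > 0.5 for _, score in detections):
        cv2.imwrite("rip_frame.jpg", frame)  # trigger automated collection
cap.release()
```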
6. Drone Detection and Tracking with YOLO and a Rule-based Method
Purbaditya Bhattacharya
The researchers were driven by the increasing need for real-time, reliable drone detection in urban
and restricted areas due to growing UAV activity, which poses safety and privacy risks. Their work
focuses on extending an existing infrared drone detection system by integrating high-resolution
color and multi-channel datasets, and enhancing detection with YOLOv7-based deep learning
models. They curated a comprehensive dataset comprising over 140,000 images captured using
infrared and visual cameras across diverse environments. The YOLOv7 model was trained with
attention-based modules like Channel Attention (CAT) and Pixel Attention (PxAT), showing
improved detection accuracy—achieving up to 0.976 mAP@0.5 on infrared images—while
maintaining real-time performance. However, challenges persisted, including detection failures
under low contrast and occlusion, as well as false positives in infrared data. Moreover, their proposed rule-
based tracking method, using confidence thresholds and cross-correlation, improved detection
continuity but still relied on successful initial detections. The system showed potential for real-
time deployment, achieving 15 FPS with 350 ms latency in test conditions. Future work includes
training with alternative models to validate generalizability, enhancing dataset quality through
better sensor calibration, improving image registration for multi-modal data fusion, and
developing robust field-testing setups to simulate real-world surveillance scenarios. This research
significantly contributes to advancing autonomous drone surveillance systems by balancing
accuracy, efficiency, and practical deployment constraints.
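The rule-based tracker is described as using confidence thresholds and cross-correlation. One minimal reading of that rule is sketched below: trust the detector while its confidence clears a threshold, otherwise relocate the last known appearance by normalized cross-correlation template matching. The function, threshold, and box format are assumptions, not the paper's code.

```python
import cv2
import numpy as np

def track_step(frame_gray, template, det_box=None, det_conf=0.0, thr=0.6):
    """One step of a simple rule-based tracker: accept the detection when its
    confidence clears the threshold, otherwise fall back to correlation search
    with the previous appearance."""
    if det_box is not None and det_conf >= thr:
        x, y, w, h = det_box
        return det_box, frame_gray[y:y + h, x:x + w]   # refresh the template
    res = cv2.matchTemplate(frame_gray, template, cv2.TM_CCOEFF_NORMED)
    _, score, _, (x, y) = cv2.minMaxLoc(res)
    h, w = template.shape
    return ((x, y, w, h) if score > thr else None), template

# Usage: seed with one confident detection, then coast through missed frames.
frame = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
box, tmpl = track_step(frame, None, det_box=(100, 120, 32, 32), det_conf=0.9)
box, tmpl = track_step(frame, tmpl)  # detector missed: correlation takes over
print(box)
```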
7. Remote Sensing Surveillance Using Multilevel Feature Fusion and Deep
Neural Network
Laiba Zahoor, Haifa F. Alhasson, Mohammed Alatiyyah, Mohammed Alnusayri, Dina Abdulaziz Alhammadi, Ahmad Jalal, and Hui Liu
The researchers were motivated by the increasing need for accurate human action recognition
(HAR) in drone-based surveillance systems, especially under complex aerial conditions such as
variable lighting, dynamic movements, and occlusions. Traditional static-camera HAR systems
often fail in such settings due to limited viewpoints and environmental constraints. To address
this, the authors proposed a comprehensive UAV-based HAR framework that integrates YOLO
for human detection with multilevel feature fusion and a deep neural network (DNN) for
classification. The system extracts and processes image frames using Gaussian blur, grayscale
conversion, and background removal before identifying human keypoints and constructing
skeletal models. Features like joint angles, geodesic distances, 3D point cloud data, and MSER
are fused to provide rich representations of human motion, which are then optimized using
Quadratic Discriminant Analysis and classified using a DNN. The system was tested on UAV
Gesture, UAV Human, and UCF-ARG datasets, achieving action recognition accuracies of
90.15%, 72.37%, and 76.50%, respectively. Despite these strong results, the research encountered
challenges such as varying performance across datasets, computational limitations (only 3 FPS on
CPU), and reduced accuracy in more complex scenes. Future work involves optimizing the system
for real-time performance, improving feature extraction techniques for better generalization, and
deploying the model on resource-constrained edge devices. This research offers a promising step
toward reliable, drone-based action recognition, with broad implications for security, search and
rescue, and disaster management applications.
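The summary names concrete preprocessing stages: Gaussian blur, grayscale conversion, and background removal. The OpenCV sketch below chains those stages; the MOG2 background subtractor and kernel size are stand-ins for whichever methods the authors actually used.

```python
import cv2
import numpy as np

def preprocess_frame(frame_bgr, bg_model):
    """Preprocessing stages named in the summary: blur, grayscale, and
    background removal (MOG2 here is an assumed stand-in)."""
    blurred = cv2.GaussianBlur(frame_bgr, (5, 5), 0)   # suppress sensor noise
    gray = cv2.cvtColor(blurred, cv2.COLOR_BGR2GRAY)   # drop color channels
    fg_mask = bg_model.apply(blurred)                  # foreground = moving people
    return cv2.bitwise_and(gray, gray, mask=fg_mask)

bg_model = cv2.createBackgroundSubtractorMOG2(history=100)
frame = np.random.randint(0, 255, (480, 640, 3), dtype=np.uint8)
print(preprocess_frame(frame, bg_model).shape)  # (480, 640)
```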
8. Lightweight multidimensional feature enhancement algorithm LPS-YOLO
for UAV remote sensing target detection
Yong Lu & Minghao Sun
The researchers were motivated by the growing demand for accurate and efficient small-target
detection in UAV remote sensing imagery, where traditional methods struggle due to limited
feature extraction capabilities and high background interference. To address this, they proposed
LPS-YOLO—a lightweight, high-performance target detection model built upon the YOLOv8
architecture. They replaced the Conv module with SPDConv to enhance fine-grained feature
extraction and introduced innovative modules such as the Separate Kernel Attention Pyramid
Pooling (SKAPP), One-way Feature Transfer Pyramid (OFTP), and Easy Bidirectional Feature
Pyramid Network (E-BiFPN) to improve feature fusion and context preservation while
minimizing parameter count. Their results showed significant improvements over baseline
models, including a 17.3% increase in mAP on VisDrone2019 and a 14.5% F1 score improvement
on DOTAv2, with up to 42.5% reduction in parameters. However, the main problem lies in reduced
real-time processing capability (lower FPS) due to the added complexity from advanced modules
like SKAPP. Despite these limitations, the model demonstrated strong performance in detecting
low-visibility, small-scale targets across diverse conditions. For future work, the researchers plan
to optimize the network’s computational efficiency using techniques like content-defined
sampling, expand applications to detect natural hazards such as ground cracks and fires, and
improve real-time adaptability for deployment on resource-constrained UAVs. This study offers a
practical and scalable solution for enhancing UAV-based surveillance, emergency response, and
remote sensing tasks.
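Of the named modules, SPDConv has a well-known public definition (space-to-depth rearrangement followed by a non-strided convolution), and its appeal for small targets is that downsampling becomes lossless. The PyTorch sketch below shows that general idea; LPS-YOLO's exact block may differ.

```python
import torch
import torch.nn as nn

class SPDConv(nn.Module):
    """Space-to-depth convolution sketch: fold each 2x2 spatial block into
    channels, then apply a non-strided conv. Follows the general SPD-Conv
    idea, not necessarily LPS-YOLO's exact block."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.conv = nn.Conv2d(in_ch * 4, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        # (B, C, H, W) -> (B, 4C, H/2, W/2) without discarding pixels, unlike
        # strided convs or pooling that drop fine-grained detail.
        x = torch.cat(
            [x[..., ::2, ::2], x[..., 1::2, ::2],
             x[..., ::2, 1::2], x[..., 1::2, 1::2]],
            dim=1,
        )
        return self.conv(x)

x = torch.randn(1, 64, 80, 80)
print(SPDConv(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```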