Efficient Small Object Detection You Only Look Once: A Small Object Detection Algorithm for Aerial Images


Bibliographic details
Published in: Sensors (Basel, Switzerland), Vol. 24, No. 21, p. 7067
Main authors: Luo, Jie; Liu, Zhicheng; Wang, Yibo; Tang, Ao; Zuo, Huahong; Han, Ping
Format: Journal Article
Language: English
Published: Switzerland, MDPI AG, 2 November 2024
MDPI
ISSN: 1424-8220
Description
Summary: Aerial images have distinct characteristics, such as varying target scales, complex backgrounds, severe occlusion, small targets, and dense distribution. As a result, object detection in aerial images struggles to extract information about small targets and to integrate spatial and semantic information. Moreover, existing object detection algorithms have large numbers of parameters, which makes them hard to deploy on drones with limited hardware resources. We propose an efficient small-object YOLO detection model (ESOD-YOLO), based on YOLOv8n, for Unmanned Aerial Vehicle (UAV) object detection. First, a Reparameterized Multi-scale Inverted Blocks (RepNIBMS) module replaces the C2f module in the YOLOv8n backbone to improve the extraction of small-object information. Second, a cross-level multi-scale feature fusion structure, the wave feature pyramid network (WFPN), is designed to enhance the model’s capacity to integrate spatial and semantic information. In addition, a small-object detection head is added to improve the model’s ability to identify small objects. Finally, a tri-focal loss function is proposed as a simple and effective way to address sample imbalance in aerial images. On the VisDrone2019 test set, with a uniform input size of 640 × 640 pixels, ESOD-YOLO has 4.46 M parameters and reaches a mean average precision (mAP) of 29.3%, which is 3.6% higher than the baseline YOLOv8n. It also achieves higher detection accuracy with fewer parameters than the other detection methods compared.
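Note: the summary does not define the tri-focal loss, so the following is only background, not the authors' method. It is a minimal sketch of the standard binary focal loss, a widely used way to handle imbalanced samples in detection, assuming the usual default values gamma = 2 and alpha = 0.25. The function names are illustrative.

```python
import math


def cross_entropy(p, y):
    """Binary cross-entropy for one prediction.

    p: predicted probability of the positive class (0 < p < 1)
    y: ground-truth label, 1 (object) or 0 (background)
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss (NOT the paper's tri-focal loss).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    (easy) samples, so the many easy background samples dominate less.
    gamma and alpha defaults are assumptions, not values from the paper.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return alpha_t * (1.0 - p_t) ** gamma * cross_entropy(p, y)


if __name__ == "__main__":
    for p in (0.9, 0.5, 0.1):  # easy, ambiguous, and hard positive samples
        print(p, cross_entropy(p, 1), focal_loss(p, 1))
```

With gamma = 0 and alpha = 0.5 the function reduces to half the ordinary cross-entropy, which makes the effect of the modulating factor easy to check in isolation.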
DOI: 10.3390/s24217067