A Multi-Scale Feature-Fusion Multi-Object Tracking Algorithm for Scale-Variant Vehicle Tracking in UAV Videos


Detailed Description

Bibliographic Details
Published in: Remote Sensing (Basel, Switzerland), Vol. 17, No. 6, p. 1014
Main Authors: Liu, Shanshan; Shen, Xinglin; Xiao, Shanzhu; Li, Hanwen; Tao, Huamin
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.03.2025
ISSN: 2072-4292
Online Access: Full text
Description
Abstract: Unmanned Aerial Vehicle (UAV) vehicle-tracking technology has extensive potential for application in various fields. In the actual tracking process, the relative movement of the UAV and vehicles brings large target-scale variations (i.e., changes in size and aspect ratio), which lead to missed detections and ID switching. Traditional tracking methods usually use multi-scale estimation to adaptively update the target scale for variable-scale detection and tracking. However, this approach requires selecting multiple scaling factors and generating a large number of bounding boxes, which results in high computational costs and affects real-time performance. To tackle the above issue, we propose a novel multi-target tracking method based on the BoT-SORT framework. Firstly, we propose an FB-YOLOv8 framework to solve the missed-detection problem. This framework incorporates a Feature Alignment Aggregation Module (FAAM) and a Bidirectional Path Aggregation Network (BPAN) to enhance multi-scale feature fusion. Secondly, we propose a multi-scale feature-fusion network (MSFF-OSNet) to extract appearance features, which solves the ID-switching problem. This framework integrates the Feature Pyramid Network (FPN) and Convolutional Block Attention Module (CBAM) into OSNet to capture multilevel pixel dependencies and combine low-level and high-level features. By effectively integrating the FB-YOLOv8 and MSFF-OSNet modules into the tracking pipeline, the accuracy and stability of tracking are improved. Experiments on the UAVDT dataset achieved 46.1% MOTA and 65.3% IDF1, which outperforms current state-of-the-art trackers. Furthermore, experiments conducted on sequences with scale variations have substantiated the improved tracking stability of our proposed method under scale-changing conditions.
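The reported figures (46.1% MOTA, 65.3% IDF1) follow the standard CLEAR-MOT and identity-metric definitions. A minimal sketch of how these two metrics are computed from raw per-dataset counts; the function names and the example counts below are illustrative assumptions, not values from the paper:

```python
def mota(fn: int, fp: int, id_switches: int, gt: int) -> float:
    """Multi-Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT,
    where GT is the total number of ground-truth object instances."""
    return 1.0 - (fn + fp + id_switches) / gt


def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """Identity F1 score: harmonic mean of ID-precision and ID-recall,
    computed from identity-matched true/false positives and negatives."""
    return 2 * idtp / (2 * idtp + idfp + idfn)


# Illustrative counts (not from the paper):
print(round(mota(fn=300, fp=150, id_switches=20, gt=1000), 3))  # 0.53
print(round(idf1(idtp=800, idfp=250, idfn=300), 3))             # 0.744
```

Note that MOTA is dominated by detection errors (FN, FP), which is why the FB-YOLOv8 detector targets it, while IDF1 rewards keeping a consistent identity over time, which is what the MSFF-OSNet appearance features address.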
DOI:10.3390/rs17061014