A two stage multi object tracking algorithm with transformer and attention mechanism
| Published in: | Scientific Reports Vol. 15; no. 1; p. 31414 - 14 |
|---|---|
| Main authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | London; Nature Publishing Group UK; 26.08.2025; Nature Publishing Group; Nature Portfolio |
| Subjects: | |
| ISSN: | 2045-2322 |
| Summary: | In the field of engineering safety, multi-object tracking struggles with unreliable object detection under occlusion and with frequent target identity switches (ID switches). To address these issues, this paper proposes a multi-object tracking model that integrates an improved You Only Look Once Version 8 (YOLOv8) detector with High-Performance Multi-Object Tracking by Tracking Bytes (ByteTrack). The model architecture follows the tracking-by-detection paradigm. In the detection stage, we use the coordinate attention mechanism to build the Coordinate Attention Spatial Pyramid Pooling - Fast Conv (CASPPFC) module and combine it with an improved Efficient Vision Transformer (EfficientViT) to enhance the YOLOv8 backbone network, reducing the false positives and false negatives caused by occlusion. In the first stage of tracking association, we propose the Omni-Scale Network-Coordinate Attention (OSNet-CA) network as the Re-identification (Re-ID) feature extractor to capture target information more effectively. In the second stage of association, we adopt the Efficient Intersection over Union (EIoU) measure to account more fully for the positional relationships between targets. The improved model is validated on the Multi-Object Tracking 2017 (MOT17) and Multi-Object Tracking 2020 (MOT20) datasets. Our tracking model achieves 80.5% Multiple Object Tracking Accuracy (MOTA), 79.3% Identification F1 Score (IDF1), and 64.2% Higher Order Tracking Accuracy (HOTA) on the MOT17 test set, and 77.8% MOTA, 76.9% IDF1, and 62.4% HOTA on the MOT20 test set. The model achieves high-precision pedestrian tracking, reduces ID switches, and improves tracking robustness, which supports timely detection of hazardous events in the engineering safety field and helps ensure personnel safety. |
| DOI: | 10.1038/s41598-025-16389-4 |
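
The summary names Efficient Intersection over Union (EIoU) as the measure used in the second association stage. The record does not give the formula, so the sketch below assumes the commonly cited EIoU formulation: plain IoU minus three penalties (centre distance, width difference, height difference), each normalised by the smallest box enclosing both inputs. The function name `eiou_similarity` and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper.

```python
def eiou_similarity(box_a, box_b, eps=1e-9):
    """EIoU-style similarity for two boxes given as (x1, y1, x2, y2).

    Assumption: general EIoU formulation, not necessarily the exact
    variant used in the paper described above.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU from the overlap rectangle.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter + eps)

    # Width and height of the smallest box enclosing both inputs.
    enc_w = max(ax2, bx2) - min(ax1, bx1)
    enc_h = max(ay2, by2) - min(ay1, by1)

    # Offsets between the two box centres.
    dx = ((ax1 + ax2) - (bx1 + bx2)) / 2.0
    dy = ((ay1 + ay2) - (by1 + by2)) / 2.0

    # Penalties for centre distance, width gap, and height gap.
    center_pen = (dx * dx + dy * dy) / (enc_w ** 2 + enc_h ** 2 + eps)
    width_pen = ((ax2 - ax1) - (bx2 - bx1)) ** 2 / (enc_w ** 2 + eps)
    height_pen = ((ay2 - ay1) - (by2 - by1)) ** 2 / (enc_h ** 2 + eps)

    return iou - center_pen - width_pen - height_pen
```

Unlike plain IoU, which is zero for every pair of non-overlapping boxes, this score keeps decreasing as boxes move further apart or differ more in shape, so it can still rank candidate matches when a track and a detection do not overlap.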