Bibliographic Details
| Title: |
Transformer-Based Multi-Target Object Detection and Tracking Framework for Robust Spatio-Temporal Memory in Dynamic Environments |
| Authors: |
Tareq Mahmod AlZubi, Umar Raza Mukhtar |
| Source: |
IEEE Access, Vol 13, Pp 47146-47164 (2025) |
| Publisher Information: |
Institute of Electrical and Electronics Engineers (IEEE), 2025. |
| Publication Year: |
2025 |
| Subject Terms: |
Multi-object tracking, transformer, encoder-decoder, multi-scale pyramid, Electrical engineering. Electronics. Nuclear engineering, memory network, TK1-9971 |
| Description: |
Multiple object tracking (MOT) is a critical task in computer vision, applied in areas such as autonomous driving, sports analytics, and human activity recognition. Traditional online MOT approaches, which rely on separate detection and re-identification (re-ID) stages, often suffer from high computational costs and suboptimal accuracy. To address these challenges, we propose MOTTNet, a unified framework that integrates detection and tracklet mapping using transformer-based architectures. MOTTNet leverages a spatiotemporal memory network to store past observations of tracked objects, allowing for more robust identity preservation and trajectory prediction over time. The Target Proposal Module generates object candidates using a transformer encoder-decoder structure, and the memory decoder integrates both proposal and track vectors to predict object locations and identities in real time, minimizing post-processing. The framework incorporates a multi-scale attention pyramid to handle variations in object scale and utilizes a deformable transformer for generating accurate object proposals. A memory encoding-decoding module aggregates object features and associates them across frames, allowing robust detection and identity preservation over time. MOTTNet achieves IDF1 scores of 79.8 and 80.4, MOTA scores of 79.3 and 80.1, and HOTA scores of 73.2 and 72.7 on the MOT17 and MOT20 datasets, respectively. Additionally, MOTTNet exhibits lower identity switch rates, significantly outperforming existing transformer-based and traditional tracking models. |
| Document Type: |
Article |
| ISSN: |
2169-3536 |
| DOI: |
10.1109/access.2025.3551672 |
| Access URL: |
https://doaj.org/article/7711794e983b42f1b873d4646dba7ac4 |
| Rights: |
CC BY |
| Accession Number: |
edsair.doi.dedup.....7f5128d2cbce412862d1f40995c4848c |
| Database: |
OpenAIRE |