Rethinking the Competition Between Detection and ReID in Multiobject Tracking

Due to balanced accuracy and speed, one-shot models which jointly learn detection and identification embeddings, have drawn great attention in multi-object tracking (MOT). However, the inherent differences and relations between detection and re-identification (ReID) are unconsciously overlooked beca...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	IEEE transactions on image processing Jg. 31; S. 3182 - 3196
Hauptverfasser:	Liang, Chao, Zhang, Zhipeng, Zhou, Xue, Li, Bing, Zhu, Shuyuan, Hu, Weiming
Format:	Journal Article
Sprache:	Englisch
Veröffentlicht:	New York IEEE 2022 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Schlagworte:	Bells Competition Computational modeling Detectors Feature extraction ID embedding Misalignment Multiobject tracking Multiple target tracking Object detection one-shot reciprocal representation learning Representations scale-aware attention Semantics Target tracking Task analysis
ISSN:	1057-7149, 1941-0042, 1941-0042
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Due to balanced accuracy and speed, one-shot models which jointly learn detection and identification embeddings, have drawn great attention in multi-object tracking (MOT). However, the inherent differences and relations between detection and re-identification (ReID) are unconsciously overlooked because of treating them as two isolated tasks in the one-shot tracking paradigm. This leads to inferior performance compared with existing two-stage methods. In this paper, we first dissect the reasoning process for these two tasks, which reveals that the competition between them inevitably would destroy task-dependent representations learning. To tackle this problem, we propose a novel reciprocal network (REN) with a self-relation and cross-relation design so that to impel each branch to better learn task-dependent representations. The proposed model aims to alleviate the deleterious tasks competition, meanwhile improve the cooperation between detection and ReID. Furthermore, we introduce a scale-aware attention network (SAAN) that prevents semantic level misalignment to improve the association capability of ID embeddings. By integrating the two delicately designed networks into a one-shot online MOT system, we construct a strong MOT tracker, namely CSTrack. Our tracker achieves the state-of-the-art performance on MOT16, MOT17 and MOT20 datasets, without other bells and whistles. Moreover, CSTrack is efficient and runs at 16.4 FPS on a single modern GPU, and its lightweight version even runs at 34.6 FPS. The complete code has been released at https://github.com/JudasDie/SOTS
Bibliographie:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23
ISSN:	1057-7149 1941-0042 1941-0042
DOI:	10.1109/TIP.2022.3165376