Semantics-Preserving Graph Propagation for Zero-Shot Object Detection

Most existing object detection models are restricted to detecting objects from previously seen categories, an approach that tends to become infeasible for rare or novel concepts. Accordingly, in this paper, we explore object detection in the context of zero-shot learning, i.e., Zero-Shot Object Dete...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:IEEE transactions on image processing Ročník 29; s. 1
Hlavní autori: Yan, Caixia, Zheng, Qinghua, Chang, Xiaojun, Luo, Minnan, Yeh, Chung-Hsing, Hauptmann, Alexander G.
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: United States IEEE 01.01.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Predmet:
ISSN:1057-7149, 1941-0042, 1941-0042
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:Most existing object detection models are restricted to detecting objects from previously seen categories, an approach that tends to become infeasible for rare or novel concepts. Accordingly, in this paper, we explore object detection in the context of zero-shot learning, i.e., Zero-Shot Object Detection (ZSD), to concurrently recognize and localize objects from novel concepts. Existing ZSD algorithms are typically based on a simple mapping-transfer strategy that is susceptible to the domain shift problem. To resolve this problem, we propose a novel Semantics-Preserving Graph Propagation model for ZSD based on Graph Convolutional Networks (GCN). More specifically, we employ a graph construction module to flexibly build category graphs by incorporating diverse correlations between category nodes; this is followed by two semantics preserving modules that enhance both category and region representations through a multi-step graph propagation process. Compared to existing mapping-transfer based methods, both the semantic description and semantic structural knowledge exhibited in prior category graphs can be effectively leveraged to boost the generalization capability of the learned projection function via knowledge transfer, thereby providing a solution to the domain shift problem. Experiments on existing seen/unseen splits of three popular object detection datasets demonstrate that the proposed approach performs favorably against state-of-the-art ZSD methods.
Bibliografia:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1057-7149
1941-0042
1941-0042
DOI:10.1109/TIP.2020.3011807