DLI-Net: Dual Local Interaction Network for Fine-Grained Sketch-Based Image Retrieval

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems for Video Technology, Vol. 32, No. 10, pp. 7177-7189
Main Authors: Sun, Haifeng; Xu, Jiaqing; Wang, Jingyu; Qi, Qi; Ge, Ce; Liao, Jianxin
Format: Journal Article
Language: English
Published: New York, IEEE, 1 October 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1051-8215, 1558-2205
Description
Summary: Fine-grained sketch-based image retrieval (FG-SBIR) is considered an ideal method of image retrieval because sketches are rich in detail and easy to obtain. It aims to find the photo in a gallery that is most similar to the input sketch. Most previous works follow the paradigm of first extracting global features and then projecting the sketch and photo features into a unified embedding space using triplet loss. However, global features are not well suited to capturing the crucial fine-grained information. Based on this observation, we propose a Dual Local Interaction Network (DLI-Net). DLI-Net explores an effective and efficient way to utilize local features for FG-SBIR. Specifically, we first propose a Local Feature Extractor to extract mid-level local features. Then, in response to the problems introduced by local features, we propose a Dual Interaction Module, which contains a Self Interaction Module and a Cross Interaction Module. The Self Interaction Module speeds up retrieval by eliminating redundant local features from the background. The Cross Interaction Module resolves spatial misalignment by making the sketches interact with the photos. Extensive experiments on six commonly used datasets show that DLI-Net outperforms state-of-the-art competitors by a significant margin with a reasonable retrieval speed. Moreover, to the best of our knowledge, DLI-Net is the first model that beats humans on all six datasets. In addition, DLI-Net performs best on the cross-category fine-grained sketch-based image retrieval task, which further demonstrates that local features are more appropriate for FG-SBIR.
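For orientation, the global-feature baseline that the summary contrasts DLI-Net with (one embedding per image, trained with triplet loss, then ranked by distance) can be sketched roughly as follows. This is a generic illustration, not the authors' code: the Euclidean distance, the margin of 0.2, and the function names `euclidean`, `triplet_loss`, and `retrieve` are all assumptions made for the example, and the feature vectors would in practice come from a trained network.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(sketch, pos_photo, neg_photo, margin=0.2):
    """Hinge-style triplet loss (margin value is an assumption).

    The loss is zero once the matching photo is closer to the sketch
    than the non-matching photo by at least `margin`.
    """
    return max(0.0, euclidean(sketch, pos_photo)
               - euclidean(sketch, neg_photo) + margin)

def retrieve(sketch, gallery):
    """Return gallery indices ordered from most to least similar."""
    return sorted(range(len(gallery)),
                  key=lambda i: euclidean(sketch, gallery[i]))

# Toy embeddings standing in for network outputs (hypothetical values).
sketch = [0.0, 0.0]
matching_photo = [0.1, 0.0]
other_photo = [1.0, 0.0]

loss = triplet_loss(sketch, matching_photo, other_photo)
ranking = retrieve(sketch, [other_photo, matching_photo])
```

Because each image is reduced to a single vector here, spatial detail is lost, which is the limitation the summary cites as motivation for working with local features instead.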
DOI: 10.1109/TCSVT.2022.3171972