A Generalized Few-Shot Object Detection Method via Extraction of Base-Novel Commonality With Memory Distillation of Category Prototypes

Bibliographic Details
Published in:	IEEE Transactions on Circuits and Systems for Video Technology, Vol. 35, no. 7, pp. 6979-6992
Main Authors:	Su, Junchi; Gao, Xin; Lu, Heping; Li, Baofeng; Zhai, Feng; Fang, Xiao; Wang, Taizhi; Li, Qiangwei
Format:	Journal Article
Language:	English
Published:	New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2025
Subjects:	Accuracy; Adaptation models; Autoencoders; catastrophic forgetting of base classes; Commonality; Data mining; Feature extraction; Generalized few-shot object detection; memory bank of category prototypes; Metalearning; Modules; Object detection; Object recognition; Prototypes; Shape; Training; variational prototype refinement module
ISSN:	1051-8215, 1558-2205

Description
Summary:	Generalized few-shot object detection aims to improve detection accuracy for novel classes while maintaining high performance on base classes. Traditional fine-tuning approaches often blur feature boundaries, causing novel samples to be misclassified as base classes or background. In addition, differences between the data distributions of base and novel classes can cause the model to "forget" base knowledge. This paper proposes a generalized few-shot detection method that leverages memory distillation of category prototypes. The approach has two key components: a variational prototype refinement module (VPRM) and a memory bank of category prototypes (MBCP). The VPRM introduces a class-agnostic feature fusion mechanism built on the original variational autoencoder. First, during the base class training stage, the mean and variance of the original base class distribution are estimated, and noise variables are converted through reparameterization into memory prototypes with strong generalization ability, which are then stored. Second, during the fine-tuning stage, the stored memory prototypes are fused with class-agnostic features of novel classes, which significantly alleviates base class bias when processing novel classes. During base class training, the MBCP stores the base class memory prototypes extracted by the VPRM and selects the best memory items by dynamically updating the category confidence and intersection-over-union threshold. These memory items serve two purposes: they constrain base class features to alleviate catastrophic forgetting of base classes, and they are fused with novel class features, adaptively extracting class-agnostic common information to strengthen the feature representation of novel classes. Experiments on PASCAL VOC and MS-COCO show superior average precision in both single-round and multi-round tests, outperforming existing state-of-the-art methods.
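Illustration:	The two mechanisms named in the summary can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the authors' implementation: it uses the standard variational autoencoder reparameterization (z = mu + sigma * eps), and the class name `PrototypeMemoryBank`, the fixed gate values, the confidence-times-IoU ranking rule, and the convex-blend fusion are all hypothetical choices made for illustration. The summary describes dynamically updated gating, which is omitted here because the record does not specify the update rule.

```python
import math
import random
from dataclasses import dataclass, field


def reparameterize(mu, log_var, rng):
    """Standard VAE reparameterization: z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


@dataclass
class PrototypeMemoryBank:
    """Hypothetical bank that keeps the single best prototype per class."""

    conf_thresh: float = 0.5  # assumed fixed gate (the paper updates it dynamically)
    iou_thresh: float = 0.5   # assumed fixed gate
    items: dict = field(default_factory=dict)  # class_id -> (score, prototype)

    def update(self, class_id, prototype, confidence, iou):
        """Store the candidate if it passes both gates and beats the stored item."""
        if confidence < self.conf_thresh or iou < self.iou_thresh:
            return False
        score = confidence * iou  # assumed ranking rule, not from the paper
        best = self.items.get(class_id)
        if best is not None and best[0] >= score:
            return False
        self.items[class_id] = (score, list(prototype))
        return True

    def fuse(self, feature, class_id, alpha=0.5):
        """Blend a feature with a stored prototype (assumed convex combination)."""
        _, proto = self.items[class_id]
        return [(1 - alpha) * f + alpha * p for f, p in zip(feature, proto)]


if __name__ == "__main__":
    rng = random.Random(0)
    bank = PrototypeMemoryBank()
    # Base training stage: sample a prototype and offer it to the bank.
    proto = reparameterize(mu=[0.2, -0.1], log_var=[-2.0, -2.0], rng=rng)
    bank.update(class_id=3, prototype=proto, confidence=0.9, iou=0.8)
    # Fine-tuning stage: fuse a novel-class feature with the stored prototype.
    print(bank.fuse([1.0, 1.0], class_id=3))
```

In this sketch the bank is filled during base training and only read during fine-tuning, mirroring the two-stage description in the summary; which stored prototype a novel-class feature should be fused with is left to the caller.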
DOI:	10.1109/TCSVT.2025.3542292