A Generalized Few-Shot Object Detection Method via Extraction of Base-Novel Commonality With Memory Distillation of Category Prototypes
| Published in: | IEEE Transactions on Circuits and Systems for Video Technology, Vol. 35, no. 7, pp. 6979-6992 |
|---|---|
| Main Authors: | , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, IEEE, 01.07.2025, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1051-8215, 1558-2205 |
| Summary: | Generalized few-shot object detection aims to improve detection accuracy for novel classes while maintaining high performance on base classes. Traditional fine-tuning approaches often blur feature boundaries, leading to misclassification of novel samples as base classes or background. Additionally, differences in data distributions between base and novel classes can cause the model to "forget" base knowledge. This paper proposes a generalized few-shot detection method that leverages memory distillation of category prototypes. The approach includes two key components: a variational prototype refinement module (VPRM) and a memory bank of category prototypes (MBCP). The VPRM introduces a class-agnostic feature fusion mechanism based on the original variational autoencoder. First, the mean and variance of the original base-class distribution are estimated in the base-class training stage; noise variables are then converted through reparameterization into memory prototypes with strong generalization ability and stored. Second, in the fine-tuning stage, the stored memory prototypes are fused with class-agnostic features of novel classes, which significantly alleviates base-class bias when processing novel classes. In the base-class training stage, the MBCP stores the base-class memory prototypes extracted by the VPRM and selects the best memory items by dynamically updating the category confidence and intersection-over-union threshold. These memory items can be used not only to constrain base-class features, alleviating catastrophic forgetting of base classes, but also to fuse with novel-class features, adaptively extracting class-agnostic common information to strengthen the feature representation of the novel classes. Experiments on PASCAL VOC and MS-COCO show superior average precision in both single-round and multi-round tests, outperforming existing state-of-the-art methods. |
|---|---|
| DOI: | 10.1109/TCSVT.2025.3542292 |
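
The summary mentions three mechanisms: reparameterized sampling of prototypes, a memory bank gated by category confidence and intersection-over-union, and fusion of stored prototypes with novel-class features. The following is a minimal Python sketch of those ideas, not the authors' implementation. All names (`reparameterize`, `PrototypeMemoryBank`, `fuse`), the fixed thresholds, the "keep the highest-confidence item" rule, and the convex-combination fusion are assumptions made for illustration; the record states that the paper updates its thresholds dynamically but gives no further detail.

```python
import math
import random


def reparameterize(mean, log_var, rng):
    """Sample z = mean + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mean, log_var)]


def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


class PrototypeMemoryBank:
    """One prototype per class; the gating rule here is hypothetical."""

    def __init__(self, min_conf=0.5, min_iou=0.5):
        # Assumption: fixed thresholds, whereas the paper updates them dynamically.
        self.min_conf = min_conf
        self.min_iou = min_iou
        self.items = {}  # class_id -> (prototype, confidence)

    def update(self, class_id, prototype, confidence, pred_box, gt_box):
        """Store the prototype if it passes both gates and beats the stored one."""
        if confidence < self.min_conf or iou(pred_box, gt_box) < self.min_iou:
            return False
        stored = self.items.get(class_id)
        if stored is not None and stored[1] >= confidence:
            return False
        self.items[class_id] = (list(prototype), confidence)
        return True


def fuse(feature, prototype, alpha=0.5):
    """Assumed fusion: convex combination of a feature and a stored prototype."""
    return [(1 - alpha) * f + alpha * p for f, p in zip(feature, prototype)]
```

In this sketch a prototype would be sampled with `reparameterize` during base-class training, offered to `PrototypeMemoryBank.update` together with its detection confidence and boxes, and later mixed into a novel-class feature with `fuse` during fine-tuning.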