Complementary Learning Subnetworks Towards Parameter-Efficient Class-Incremental Learning

Detailed bibliography
Title: Complementary Learning Subnetworks Towards Parameter-Efficient Class-Incremental Learning
Authors: Depeng Li, Zhigang Zeng, Wei Dai, Ponnuthurai Nagaratnam Suganthan
Source: IEEE Transactions on Knowledge and Data Engineering. 37:3240-3252
Publisher information: Institute of Electrical and Electronics Engineers (IEEE), 2025.
Publication year: 2025
Subjects: complementary learning system, class-incremental learning, streaming data modeling, non-stationary data
Description: In class-incremental learning (CIL), deep neural networks must adapt their parameters to non-stationary data distributions, e.g., the emergence of new classes over time. To mitigate catastrophic forgetting, typical CIL methods either cumulatively store exemplars of old classes and retrain model parameters from scratch, or progressively expand the model as new classes arrive; both pay little attention to parameter efficiency, which compromises their practical value. In this paper, we contribute a novel solution that effectively controls the parameters of a well-trained model through the synergy between two complementary learning subnetworks. Specifically, we integrate one plastic feature extractor and one analytical feed-forward classifier into a unified framework amenable to streaming data. In each CIL session, it achieves non-overwritten parameter updates in a cost-effective manner, neither revisiting old task data nor extending previously learned networks; instead, it accommodates new tasks by attaching a tiny set of declarative parameters to its backbone, in which only one matrix per task or one vector per class is kept for knowledge retention. Experimental results on a variety of task sequences demonstrate that our method achieves competitive results against state-of-the-art CIL approaches, especially in accuracy gain, knowledge transfer, training efficiency, and task-order robustness. Furthermore, a graceful forgetting implementation on previously learned trivial tasks is empirically investigated so that its non-growing backbone (i.e., a model with limited network capacity) suffices to train on more incoming tasks.
Document type: Article
File description: application/pdf
ISSN: 2326-3865
1041-4347
DOI: 10.1109/tkde.2025.3550809
Access URL: https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=105000289812&origin=inward
https://hdl.handle.net/10576/64812
Rights: IEEE Copyright
Accession number: edsair.doi.dedup.....782ceaa037e54aa376096a9dece570fa
Database: OpenAIRE