Decoupled Knowledge Distillation
State-of-the-art distillation methods are mainly based on distilling deep features from intermediate layers, while the significance of logit distillation has been greatly overlooked. To provide a novel viewpoint on logit distillation, we reformulate the classical KD loss into two parts, i.e., targe...
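The classical KD loss the abstract refers to is typically the temperature-softened KL divergence between teacher and student class probabilities (the Hinton-style logit loss). Below is a minimal, dependency-free sketch of that loss; the function names and the temperature value are illustrative, and this does not reproduce the paper's proposed decoupling.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of raw logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=4.0):
    """Classical logit-distillation loss:
    T^2 * KL(softmax(teacher / T) || softmax(student / T)).
    The T^2 factor keeps gradient magnitudes comparable across temperatures."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    return temperature ** 2 * kl
```

When the student's logits match the teacher's exactly, the loss is zero; any mismatch in the softened distributions yields a positive penalty.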
| Published in: | Proceedings (IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Online) pp. 11943 - 11952 |
|---|---|
| Main Authors: | , , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.06.2022 |
| Subjects: | |
| ISSN: | 1063-6919 |