Model Compression Using Optimal Transport
Model compression methods are important for easier deployment of deep learning models in compute-, memory-, and energy-constrained environments such as mobile phones. Knowledge distillation is a class of model compression algorithms in which knowledge from a large teacher network is transferred to a smaller student network, thereby improving the student's performance. In this paper, we show how optimal transport-based loss functions can be used to train a student network, encouraging student network parameters that bring the distribution of student features closer to that of the teacher features. We present image classification results on CIFAR-100, SVHN and ImageNet, and show that the proposed optimal transport loss functions perform comparably to or better than other loss functions.
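The abstract describes the method only at a high level. The snippet below is a minimal illustrative sketch of one common way to realize an optimal transport loss between batches of student and teacher features, using an entropic (Sinkhorn) approximation in PyTorch. The function name `sinkhorn_ot_loss`, the uniform marginals, and the squared-Euclidean ground cost are assumptions for illustration, not the paper's exact formulation.

```python
import torch


def sinkhorn_ot_loss(student_feats: torch.Tensor,
                     teacher_feats: torch.Tensor,
                     eps: float = 0.05,
                     n_iters: int = 50) -> torch.Tensor:
    """Entropic OT cost between a batch of student and teacher features.

    student_feats: (n, d) student features; gradients flow through these.
    teacher_feats: (m, d) teacher features; treated as a fixed target.
    """
    teacher_feats = teacher_feats.detach()

    # Pairwise squared Euclidean ground cost between the two feature sets.
    cost = torch.cdist(student_feats, teacher_feats, p=2) ** 2
    n, m = cost.shape

    # Uniform marginals over the batch elements on both sides.
    mu = torch.full((n,), 1.0 / n, dtype=cost.dtype, device=cost.device)
    nu = torch.full((m,), 1.0 / m, dtype=cost.dtype, device=cost.device)

    # Dual potentials, updated with log-domain Sinkhorn iterations
    # for numerical stability.
    u = torch.zeros(n, dtype=cost.dtype, device=cost.device)
    v = torch.zeros(m, dtype=cost.dtype, device=cost.device)

    def log_kernel(u, v):
        return (-cost + u[:, None] + v[None, :]) / eps

    for _ in range(n_iters):
        u = eps * (torch.log(mu) - torch.logsumexp(log_kernel(u, v), dim=1)) + u
        v = eps * (torch.log(nu) - torch.logsumexp(log_kernel(u, v), dim=0)) + v

    # Approximate transport plan and its cost under the ground cost matrix.
    plan = torch.exp(log_kernel(u, v))
    return torch.sum(plan * cost)
```

In a distillation setup of this kind, such a term would typically be added to the usual cross-entropy objective, e.g. `loss = ce_loss + lambda_ot * sinkhorn_ot_loss(student_h, teacher_h)`, where `lambda_ot` weights the feature-matching term.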
| Published in: | Proceedings / IEEE Workshop on Applications of Computer Vision, pp. 3645 - 3654 |
|---|---|
| Main authors: | |
| Medium: | Conference paper |
| Language: | English |
| Published: | IEEE, 01.01.2022 |
| Topics: | |
| ISSN: | 2642-9381 |
| Online access: | Get full text |
| DOI: | 10.1109/WACV51458.2022.00370 |