A Novel Deep Learning Model Compression Algorithm

Bibliographic Details
Published in:	Electronics (Basel) Vol. 11; no. 7; p. 1066
Main Authors:	Zhao, Ming; Li, Meng; Peng, Sheng-Lung; Li, Jie
Format:	Journal Article
Language:	English
Published:	Basel MDPI AG 01.04.2022
Subjects:	20th century; Accuracy; Algorithms; Deep learning; Distillation; Knowledge; Machine learning; Model accuracy; Neural networks; Power consumption; Probability distribution; Teachers
ISSN:	2079-9292

Description
Summary:	To reduce the computing power consumed by large models, this paper proposes a novel model compression algorithm. First, it proposes an interpretable method for allocating the loss weights among a student network (a network model with poorer performance), a teacher network (a network model with better performance), and the ground-truth labels. Then, unlike earlier approaches that simply prune and fine-tune, it performs knowledge distillation on the pruned model and quantizes the remaining weights of the distilled model. These operations further reduce model size and computational cost while maintaining model accuracy. The experimental results show that the proposed weight allocation method assigns relatively appropriate weights to the teacher network and the ground-truth labels. On the CIFAR-10 dataset, the pruning method combined with knowledge distillation and quantization reduces the memory size of the ResNet-32 network model from 3726 KB to 1842 KB while keeping accuracy at 93.28%, higher than that of the original model. Compared with similar pruning algorithms, both model accuracy and operation speed are greatly improved.
DOI:	10.3390/electronics11071066
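
The Summary describes a loss that balances the teacher network against the ground-truth labels when training the student network. The record does not give the paper's actual weighting rule, so the sketch below only illustrates the conventional form of such a loss: a fixed weight `alpha` on the teacher term and `1 - alpha` on the label term, with a softmax `temperature` and the customary temperature-squared scaling. The function names, `alpha`, and `temperature` are all assumptions made for illustration and are not taken from the article.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, predicted_probs, eps=1e-12):
    """H(target, predicted) = -sum_i target_i * log(predicted_i)."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(target_probs, predicted_probs))


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=4.0):
    """Weighted sum of a ground-truth term and a teacher term.

    alpha (assumed, fixed here) is the weight given to the teacher;
    the ground-truth label receives 1 - alpha. The article proposes
    its own interpretable way to choose this split, which this
    sketch does not reproduce.
    """
    one_hot = [1.0 if i == label else 0.0
               for i in range(len(student_logits))]
    # Hard term: student prediction against the ground-truth label.
    hard = cross_entropy(one_hot, softmax(student_logits))
    # Soft term: student against the teacher's softened distribution.
    soft = cross_entropy(softmax(teacher_logits, temperature),
                         softmax(student_logits, temperature))
    # Temperature-squared scaling keeps the soft term's gradient
    # magnitude comparable to the hard term's.
    return (1.0 - alpha) * hard + alpha * temperature ** 2 * soft


if __name__ == "__main__":
    student = [2.0, 0.5, -1.0]   # hypothetical student logits
    teacher = [3.0, 1.0, -2.0]   # hypothetical teacher logits
    print(distillation_loss(student, teacher, label=0))
```

Setting `alpha=0.0` reduces the loss to ordinary cross-entropy on the labels, and `alpha=1.0` trains the student on the teacher's outputs alone; the weight allocation method the Summary mentions concerns where between these two extremes the split should fall.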