Learning Low-Rank Representations for Model Compression
| Published in: | Proceedings of ... International Joint Conference on Neural Networks, pp. 1-9 |
|---|---|
| Main Authors: | , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 18.06.2023 |
| Subjects: | |
| ISSN: | 2161-4407 |
| Online Access: | Get full text |
| Summary: | Vector Quantization (VQ) is an appealing model compression method to obtain a tiny model with little accuracy loss. While methods to obtain better codebooks and codes under fixed clustering dimensionality have been extensively studied, optimizations via the reduction of subvector dimensionality have not been carefully considered. This paper reports our recent progress on model compression with the combination of dimensionality reduction and vector quantization, proposing Low-Rank Representation Vector Quantization (LR$^2$VQ). LR$^2$VQ joins low-rank representation with subvector clustering to construct a new kind of building block that is optimized by end-to-end training. In our method, the compression ratio is directly controlled by the dimensionality of the subvectors, and the final accuracy is solely determined by the clustering dimensionality $\tilde{d}$. We recognize $\tilde{d}$ as a trade-off between low-rank approximation error and clustering error and carry out both theoretical analysis and experimental observations that empower the estimation of a proper $\tilde{d}$ before fine-tuning. With a proper $\tilde{d}$, we evaluate LR$^2$VQ with ResNet-18/ResNet-50 on the ImageNet classification dataset, achieving 2.8%/1.0% top-1 accuracy improvements over the current state-of-the-art model compression algorithms at 43×/31× compression factors. |
|---|---|
| DOI: | 10.1109/IJCNN54540.2023.10191936 |
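
The summary describes combining dimensionality reduction with vector quantization of weight subvectors. As a rough illustration only, and not the authors' LR$^2$VQ implementation (which learns the representation end to end), the NumPy sketch below projects length-$d$ weight subvectors onto a $\tilde{d}$-dimensional basis obtained from an SVD and then quantizes the projected subvectors with plain k-means; the function name `low_rank_vq_sketch` and all parameter choices are hypothetical.

```python
# Minimal sketch of "low-rank projection + subvector vector quantization".
# This is an assumption-laden illustration, not the paper's LR^2VQ method.
import numpy as np

def low_rank_vq_sketch(W, d, d_tilde, n_codewords=256, n_iter=20, seed=0):
    """Split W into length-d subvectors, project them to d_tilde dims via SVD,
    and cluster the projected subvectors with k-means (Lloyd's algorithm)."""
    rng = np.random.default_rng(seed)
    sub = W.reshape(-1, d)                              # n subvectors of length d
    _, _, Vt = np.linalg.svd(sub, full_matrices=False)  # right-singular directions
    P = Vt[:d_tilde].T                                  # d x d_tilde low-rank basis
    Z = sub @ P                                         # projected subvectors (n x d_tilde)
    codebook = Z[rng.choice(len(Z), n_codewords, replace=False)].copy()
    for _ in range(n_iter):
        # Squared Euclidean distances between subvectors and codewords.
        dists = (Z ** 2).sum(1, keepdims=True) - 2.0 * Z @ codebook.T + (codebook ** 2).sum(1)
        codes = dists.argmin(1)
        for k in range(n_codewords):
            members = Z[codes == k]
            if len(members):
                codebook[k] = members.mean(0)
    # Decode and map back to the original d dimensions to reconstruct W.
    W_hat = (codebook[codes] @ P.T).reshape(W.shape)
    return codes, codebook, P, W_hat

if __name__ == "__main__":
    # Toy example: a 512x512 weight matrix, d=8 subvectors projected to d_tilde=4.
    W = np.random.randn(512, 512).astype(np.float32)
    codes, codebook, P, W_hat = low_rank_vq_sketch(W, d=8, d_tilde=4)
    err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
    print(f"relative reconstruction error: {err:.3f}")
```

In a scheme of this shape, the stored model reduces to the codes, the codebook, and the projection basis, which is consistent with the abstract's point that the compression ratio is governed by the subvector dimensionality while accuracy hinges on the choice of $\tilde{d}$.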