Learning Low-Precision Structured Subnetworks Using Joint Layerwise Channel Pruning and Uniform Quantization

Pruning and quantization are core techniques used to reduce the inference costs of deep neural networks. Among the state-of-the-art pruning techniques, magnitude-based pruning algorithms have demonstrated consistent success in the reduction of both weight and feature map complexity. However, we find...

Bibliographic Details
Published in:	Applied sciences Vol. 12; no. 15; p. 7829
Main Authors:	Zhang, Xinyu; Colbert, Ian; Das, Srinjoy
Format:	Journal Article
Language:	English
Published:	Basel: MDPI AG, 01.08.2022
Subjects:	Algorithms; channel pruning; Energy consumption; joint pruning; layerwise pruning; Neural networks; Performance evaluation; quantization; Sparsity
ISSN:	2076-3417