Learning Low-Precision Structured Subnetworks Using Joint Layerwise Channel Pruning and Uniform Quantization

Pruning and quantization are core techniques used to reduce the inference costs of deep neural networks. Among the state-of-the-art pruning techniques, magnitude-based pruning algorithms have demonstrated consistent success in the reduction of both weight and feature map complexity. However, we find...

Bibliographic Details
Published in:	Applied sciences Vol. 12; no. 15; p. 7829
Main Authors:	Zhang, Xinyu, Colbert, Ian, Das, Srinjoy
Format:	Journal Article
Language:	English
Published:	Basel MDPI AG 01.08.2022
Subjects:	Algorithms; channel pruning; Energy consumption; joint pruning; layerwise pruning; Neural networks; Performance evaluation; quantization; Sparsity
ISSN:	2076-3417