Compressing and Fine-tuning DNNs for Efficient Inference in Mobile Device-Edge Continuum

Pruning deep neural networks (DNNs) is a well-known technique that allows for a significant reduction in inference cost. However, pruning may severely degrade the accuracy achieved by the model unless the model is properly fine-tuned, which may, in turn, increase computational cost and latency....

Bibliographic details
Published in:	2024 IEEE International Mediterranean Conference on Communications and Networking (MeditCom), pp. 305-310
Main authors:	Singh, Gurtaj; Chukhno, Olga; Campolo, Claudia; Molinaro, Antonella; Chiasserini, Carla Fabiana
Format:	Conference paper
Language:	English
Publication details:	IEEE, 08.07.2024
Subjects:	Accuracy; Artificial neural networks; Computational modeling; Costs; Edge computing; Edge-mobile device continuum; Machine learning; Machine learning pipeline; ML model compression; Mobile handsets; Pipelines