A comprehensive review of model compression techniques in machine learning


Bibliographic details
Published in:	Applied Intelligence (Dordrecht, Netherlands), Vol. 54, No. 22, pp. 11804-11844
Main authors:	Dantas, Pierre Vilar; Sabino da Silva, Waldir; Cordeiro, Lucas Carvalho; Carvalho, Celso Barbosa
Format:	Journal Article
Language:	English
Publication details:	New York: Springer US, 01.11.2024; Springer Nature B.V.
Subjects:	Artificial Intelligence; Complexity; Computer Science; Computing time; Edge computing; Efficiency; Internet of Things; Machine learning; Machines; Manufacturing; Mechanical Engineering; Optimization; Processes; System effectiveness; Architectural innovations; Technological evolution in machine learning; Lightweight design approaches; Computational efficiency; Model generalization; Neural network compression
ISSN:	0924-669X, 1573-7497

Description
Summary:	This paper critically examines model compression techniques within the machine learning (ML) domain, emphasizing their role in enhancing model efficiency for deployment in resource-constrained environments, such as mobile devices, edge computing, and Internet of Things (IoT) systems. By systematically exploring compression techniques and lightweight design architectures, it provides a comprehensive understanding of their operational contexts and effectiveness. The synthesis of these strategies reveals a dynamic interplay between model performance and computational demand, highlighting the balance required for optimal application. As ML models grow increasingly complex and data-intensive, the demand for computational resources and memory has surged accordingly. This escalation presents significant challenges for the deployment of artificial intelligence (AI) systems in real-world applications, particularly where hardware capabilities are limited. Therefore, model compression techniques are not merely advantageous but essential for ensuring that these models can be utilized across various domains, maintaining high performance without prohibitive resource requirements. Furthermore, this review underscores the importance of model compression in sustainable AI development. The introduction of hybrid methods, which combine multiple compression techniques, promises to deliver superior performance and efficiency. Additionally, the development of intelligent frameworks capable of selecting the most appropriate compression strategy based on specific application needs is crucial for advancing the field. The practical examples and engineering applications discussed demonstrate the real-world impact of these techniques. By optimizing the balance between model complexity and computational efficiency, model compression ensures that the advancements in AI technology remain sustainable and widely applicable. This comprehensive review thus contributes to the academic discourse and guides innovative solutions for efficient and responsible machine learning practices, paving the way for future advancements in the field.
DOI:	10.1007/s10489-024-05747-w