Coding for Large-Scale Distributed Machine Learning

Bibliographic details
Published in:	Entropy (Basel, Switzerland) Vol. 24; No. 9; p. 1284
Main authors:	Xiao, Ming; Skoglund, Mikael
Format:	Journal Article
Language:	English
Published:	Basel, MDPI AG, 1 September 2022
Subjects:	ADMM; Algorithms; Analysis; Blacklisting; Coding; Coding theory; Cognitive tasks; Communication; Communications networks; Computation; Distributed processing (Computers); Efficiency; Error analysis; error-control coding; gradient coding; Internet of Things; Large-scale systems; Machine learning; Methods; Neural networks; Nodes; Optimization; Privacy; random codes; Review; Sweden
ISSN:	1099-4300
Online access:	Full text

Description
Abstract:	This article aims to give a comprehensive and rigorous review of the principles and recent developments of coding for large-scale distributed machine learning (DML). With increasing data volumes and the pervasive deployment of sensors and computing machines, machine learning has become more distributed. Moreover, the number of computing nodes and the data volumes involved in learning tasks have also increased significantly. For large-scale distributed learning systems, significant challenges have appeared in terms of delay, errors, and efficiency, among others. To address these problems, various error-control or performance-boosting schemes have been proposed recently for different aspects, such as the duplication of computing nodes. More recently, error-control coding has been investigated for DML to improve reliability and efficiency. The benefits of coding for DML include high efficiency and low complexity. Despite these benefits and recent progress, however, there is still no comprehensive survey on this topic, especially for large-scale learning. This paper seeks to introduce the theories and algorithms of coding for DML. For primal-based DML schemes, we first discuss gradient coding with the optimal code distance. Then, we introduce random coding for gradient-based DML. For primal–dual-based DML, i.e., ADMM (alternating direction method of multipliers), we propose a separate coding method for the two steps of distributed optimization. Then, coding schemes for the different steps are discussed. Finally, a few potential directions for future work are given.
DOI:	10.3390/e24091284