An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent


Bibliographic Details
Published in: Computer Communications, Vol. 195, pp. 416–423
Authors: Tu, Jun; Zhou, Jia; Ren, Donglin
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2022
ISSN: 0140-3664, 1873-703X
Description
Abstract: Cyber–Physical Systems (CPS) applications are playing an increasingly important role in our lives, hence centralized distributed machine learning (ML) has begun to see widespread use in CPS to secure these applications. However, existing centralized distributed ML algorithms have significant shortcomings in CPS scenarios: their synchronization step incurs high latency and is sensitive to node drop-out, which affects the security of the CPS. Therefore, combining the Gossip protocol with Stochastic Gradient Descent (SGD), this paper proposes a communication framework for machine learning, Gossip Ring SGD (GR-SGD). GR-SGD is decentralized and asynchronous, and solves the problem of long communication waiting times. This paper uses the ImageNet data set and the ResNet model to verify the feasibility of the algorithm and compares it with Ring AllReduce and D-PSGD. Moreover, this paper also shows that some data redundancy can reduce communication overhead and increase system fault tolerance, so that the algorithm can be better applied to CPS and to all kinds of machine learning models. (A rough code sketch of the general pattern follows the record below.)
DOI: 10.1016/j.comcom.2022.09.010
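
The record contains no code, so purely as an illustration of the general pattern the abstract names (decentralized workers that gossip-average parameters with a ring neighbor instead of synchronizing through a central server or a global AllReduce barrier), here is a minimal self-contained sketch on a toy least-squares task. It simulates gossip rounds synchronously in one process; all names, hyperparameters, and the mixing scheme are assumptions for illustration, not the authors' actual GR-SGD.

```python
# Minimal sketch (assumption, not the authors' GR-SGD): each worker runs local
# SGD on its own data shard and periodically averages parameters with its ring
# successor only, so communication is pairwise rather than a global collective.
import numpy as np

rng = np.random.default_rng(0)

# Toy linear-regression task: y = X @ w_true + noise, sharded across workers.
w_true = rng.normal(size=5)
X = rng.normal(size=(1000, 5))
y = X @ w_true + 0.01 * rng.normal(size=1000)

NUM_WORKERS = 4
shards = np.array_split(np.arange(len(X)), NUM_WORKERS)
params = [np.zeros(5) for _ in range(NUM_WORKERS)]  # one model copy per worker

LR, GOSSIP_EVERY, STEPS, BATCH = 0.05, 5, 200, 32

for step in range(STEPS):
    # Local SGD step on each worker's own shard (no central parameter server).
    for k in range(NUM_WORKERS):
        idx = rng.choice(shards[k], size=BATCH)
        grad = 2.0 * X[idx].T @ (X[idx] @ params[k] - y[idx]) / BATCH
        params[k] -= LR * grad

    # Gossip round: each worker mixes with its ring neighbor, so models drift
    # toward consensus without any global synchronization barrier.
    if step % GOSSIP_EVERY == 0:
        params = [0.5 * (params[k] + params[(k + 1) % NUM_WORKERS])
                  for k in range(NUM_WORKERS)]

print("param error per worker:",
      [round(float(np.linalg.norm(p - w_true)), 4) for p in params])
```

In the asynchronous setting the paper describes, each worker would instead send and merge parameters whenever its own local step finishes, rather than at a shared step counter as simulated here; the ring topology and pairwise averaging are what remove the long communication waits of synchronous AllReduce-style training.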