An asynchronous distributed training algorithm based on Gossip communication and Stochastic Gradient Descent


Detailed bibliography
Published in: Computer Communications, Vol. 195, pp. 416-423
Main authors: Tu, Jun; Zhou, Jia; Ren, Donglin
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.11.2022
ISSN: 0140-3664, 1873-703X
Description
Summary: Cyber–Physical Systems (CPS) applications are playing an increasingly important role in our lives, and centralized distributed machine learning is beginning to see widespread use for securing such applications. However, existing centralized distributed machine learning (ML) algorithms have significant shortcomings in CPS scenarios: their synchronization schemes suffer from high latency and are sensitive to worker drop-out, which undermines CPS security. Combining the Gossip protocol with Stochastic Gradient Descent (SGD), this paper proposes Gossip Ring SGD (GR-SGD), a communication framework for machine learning. GR-SGD is decentralized and asynchronous, and it eliminates long communication waiting times. The paper verifies the feasibility of the algorithm on the ImageNet data set with the ResNet model and compares it against Ring AllReduce and D-PSGD. It also shows that a degree of data redundancy can reduce communication overhead and increase system fault tolerance, allowing the framework to be applied to CPS and a wide range of machine learning models.
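The summary describes GR-SGD only at a high level, so the following is a minimal, hypothetical sketch of the general idea it names: decentralized SGD in which each worker takes a local gradient step on its own data shard and then averages parameters with its ring neighbor instead of waiting on a global AllReduce. All names here (gossip_ring_sgd, local_grad, the toy least-squares shards) are illustrative assumptions, not the paper's implementation, and the gossip exchange is lock-stepped only for brevity, whereas the paper's GR-SGD is asynchronous.

    import numpy as np

    # Hypothetical sketch of ring-gossip SGD; NOT the paper's actual GR-SGD.
    # Each worker keeps its own parameter copy, takes a local SGD step on its
    # data shard, then averages parameters with its ring successor instead of
    # performing a global AllReduce.

    def local_grad(w, shard):
        # Toy least-squares gradient on this worker's shard (X, y).
        X, y = shard
        return X.T @ (X @ w - y) / len(y)

    def gossip_ring_sgd(shards, dim, steps=300, lr=0.05):
        n = len(shards)
        params = [np.zeros(dim) for _ in range(n)]
        for _ in range(steps):
            # Local SGD step on every worker.
            params = [w - lr * local_grad(w, s) for w, s in zip(params, shards)]
            # Gossip step: worker i averages with neighbor (i + 1) mod n.
            # Synchronized here for brevity; GR-SGD runs this asynchronously.
            params = [(params[i] + params[(i + 1) % n]) / 2 for i in range(n)]
        return params

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        w_true = rng.normal(size=5)
        shards = []
        for _ in range(4):  # four workers, one shard each
            X = rng.normal(size=(64, 5))
            shards.append((X, X @ w_true + 0.01 * rng.normal(size=64)))
        final = gossip_ring_sgd(shards, dim=5)
        print("max deviation from w*:",
              max(np.linalg.norm(w - w_true) for w in final))

Averaging with the ring successor implements a doubly stochastic circulant mixing matrix, the standard object in gossip-averaging analyses; the paper's actual communication schedule, fault-tolerance mechanism, and redundancy scheme may differ.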
DOI: 10.1016/j.comcom.2022.09.010