2D-THA-ADMM: communication efficient distributed ADMM algorithm framework based on two-dimensional torus hierarchical AllReduce

Bibliographic Details
Published in:	International Journal of Machine Learning and Cybernetics, Vol. 15, No. 2, pp. 207-226
Main Authors:	Wang, Guozheng; Lei, Yongmei; Zhang, Zeyu; Peng, Cunlu
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg, 01.02.2024
Subjects:	Artificial Intelligence Complex Systems Computational Intelligence Control Engineering Mechatronics Original Article Pattern Recognition Robotics Systems Biology Hierarchical AllReduce Two-dimensional torus Synchronization algorithm ADMM
ISSN:	1868-8071, 1868-808X

Description
Summary:	Model synchronization refers to the communication process involved in large-scale distributed machine learning tasks. As the cluster scales up, the synchronization of model parameters becomes a challenging task that has to be coordinated among thousands of workers. Firstly, this study proposes a hierarchical AllReduce algorithm structured on a two-dimensional torus (2D-THA), which utilizes a hierarchical structure to synchronize model parameters and maximize bandwidth utilization. Secondly, this study introduces a distributed consensus algorithm called 2D-THA-ADMM, which combines the 2D-THA synchronization algorithm with the alternating direction method of multipliers (ADMM). Thirdly, we evaluate the model parameter synchronization performance of 2D-THA and the scalability of 2D-THA-ADMM on the Tianhe-2 supercomputing platform using real public datasets. Our experiments demonstrate that 2D-THA significantly reduces synchronization time by 63.447% compared to MPI_Allreduce. Furthermore, the proposed 2D-THA-ADMM algorithm exhibits excellent scalability, with a training speed increase of over 3× compared to state-of-the-art methods, while maintaining high accuracy and computational efficiency.
DOI:	10.1007/s13042-023-01903-9
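
Illustrative note: the summary does not describe how 2D-THA is implemented, so the following is only a minimal sketch of a generic hierarchical AllReduce (sum) on a two-dimensional worker grid, not the authors' algorithm. It simulates the three phases commonly used for this kind of topology (reduce-scatter along rows, all-reduce along columns, all-gather along rows) in a single process. The function name `torus_allreduce`, the grid layout, and the phase breakdown are assumptions made for illustration; real communication and the torus wrap-around links are not modeled.

```python
from typing import List

Vector = List[float]


def torus_allreduce(grid: List[List[Vector]]) -> List[List[Vector]]:
    """Simulate a three-phase AllReduce (sum) on a rows x cols worker grid.

    grid[r][c] is the local vector held by the (hypothetical) worker at
    row r, column c. Returns a grid of the same shape in which every
    worker holds the element-wise sum over all workers.
    """
    rows, cols = len(grid), len(grid[0])
    n = len(grid[0][0])
    assert n % cols == 0, "vector length must be divisible by the column count"
    chunk = n // cols

    # Phase 1: reduce-scatter within each row.
    # Worker (r, c) ends up owning chunk c of its row's element-wise sum.
    owned = [
        [
            [sum(grid[r][k][i] for k in range(cols))
             for i in range(c * chunk, (c + 1) * chunk)]
            for c in range(cols)
        ]
        for r in range(rows)
    ]

    # Phase 2: all-reduce within each column on the owned chunk only.
    # Every worker in column c now holds chunk c of the global sum.
    col_sum = [
        [sum(owned[r][c][i] for r in range(rows)) for i in range(chunk)]
        for c in range(cols)
    ]

    # Phase 3: all-gather within each row to reassemble the full vector.
    full = [x for c in range(cols) for x in col_sum[c]]
    return [[list(full) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    # 2 x 2 grid; worker w = r * 2 + c holds [w, w + 1, w + 2, w + 3].
    demo = [[[r * 2 + c + i for i in range(4)] for c in range(2)]
            for r in range(2)]
    print(torus_allreduce(demo)[0][0])
```

In this sketch, phase 2 moves only 1/cols of the vector per worker, which is the usual motivation for splitting an AllReduce along the two grid dimensions.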