Communication Algorithm-Architecture Co-Design for Distributed Deep Learning

Large-scale distributed deep learning training has enabled developments of more complex deep neural network models to learn from larger datasets for sophisticated tasks. In particular, distributed stochastic gradient descent intensively invokes all-reduce operations for gradient update, which domina...

Bibliographic Details
Published in:	Proceedings - International Symposium on Computer Architecture, pp. 181-194
Main Authors:	Huang, Jiayi; Majumder, Pritam; Kim, Sungkeun; Muzahid, Abdullah; Yum, Ki Hwan; Kim, Eun Jung
Format:	Conference Paper
Language:	English
Published:	IEEE, 1 June 2021
Subjects:	algorithm-architecture co-design; all-reduce; data-parallel training; Deep learning; distributed deep learning; interconnection network; Network topology; Schedules; Scheduling; Stochastic processes; Topology; Training
ISSN:	2575-713X