Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models

Foundation models are in the process of becoming the dominant deep learning technology. Pretraining a foundation model is always time-consuming due to the large scale of both the model parameter and training dataset. Besides being computing-intensive, the pretraining process is extremely memory- and...

Bibliographic Details
Published in:	IEEE Transactions on Parallel and Distributed Systems, Vol. 34, No. 5, pp. 1 - 13
Main Authors:	Lai, Zhiquan; Li, Shengwei; Tang, Xudong; Ge, Keshi; Liu, Weijie; Duan, Yabo; Qiao, Linbo; Li, Dongsheng
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 May 2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithms; Automation; Computation; Computational modeling; Computer memory; Critical path; Data models; Deep learning; Distributed Systems; Foundation model training; Mathematical models; Parallel processing; Parameters; Pipelines; Pipelining (computers); Resource utilization; Solid modeling; Tensors; Three-dimensional displays; Training
ISSN:	1045-9219, 1558-2183
Online Access:	Full text