CAPTURE: Memory-Centric Partitioning for Distributed DNN Training with Hybrid Parallelism

Deep Learning (DL) model sizes are increasing at a rapid pace, as larger models typically offer better statistical performance. Modern Large Language Models (LLMs) and image processing models contain billions of trainable parameters. Training such massive neural networks incurs significant memory re...

Bibliographic Details
Published in:	Proceedings - International Conference on High Performance Computing, pp. 76-86
Main Authors:	Dreuning, Henk; Verstoep, Kees; Bal, Henri E.; van Nieuwpoort, Rob V.
Format:	Conference proceeding
Language:	English
Published:	IEEE, 18 December 2023
Subjects:	Costs; Deep Learning; GPU; Graphics processing units; HPC; Hybrid Parallelism; Memory; Memory management; Predictive models; Tensors; Throughput; Training
ISSN:	2640-0316
Online Access:	Full text