SSDTrain: An Activation Offloading Framework to SSDs for Faster Large Language Model Training

The growth rate of the GPU memory capacity has not been able to keep up with that of the size of large language models (LLMs), hindering the model training process. In particular, activations (the intermediate tensors produced during forward propagation and reused in backward propagation) dominate the...

Bibliographic Details
Published in:	2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main Authors:	Wu, Kun; Park, Jeongmin Brian; Zhang, Xiaofan; Hidayetoglu, Mert; Mailthody, Vikram Sharma; Huang, Sitao; Lumetta, Steve; Hwu, Wen-Mei
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 22 June 2025
Subjects:	Adaptation models; Graphics processing units; Large language models; Large-scale systems; Nonvolatile memory; Parallel processing; Pipelines; Tensors; Throughput; Training
Online Access:	Full text