Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers

Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions of parameters requires tens of thousands of GPUs and a highly scalable software stack. In this work, we present a novel four-dimensional hybrid parallel algorithm implemented in a highly scalable, portable, ...
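As a rough intuition for what a four-dimensional hybrid parallel decomposition entails, the sketch below maps flat GPU ranks onto a hypothetical 4-D process grid. The axis sizes and names (one data-parallel axis and three tensor-parallel axes) are illustrative assumptions only; the truncated abstract above does not specify the paper's actual decomposition.

    # Minimal illustrative sketch (Python): mapping GPU ranks onto a
    # hypothetical 4-D process grid. The axis layout is an assumption
    # for illustration, not the paper's actual scheme.

    def grid_coords(rank, dims):
        """Map a flat GPU rank to its coordinates in a row-major 4-D grid."""
        coords = []
        for dim in reversed(dims):
            coords.append(rank % dim)
            rank //= dim
        return tuple(reversed(coords))

    # Example: 16,384 GPUs arranged as 64 (data) x 8 x 8 x 4 (tensor axes).
    dims = (64, 8, 8, 4)
    assert 64 * 8 * 8 * 4 == 16_384

    for r in (0, 1, 4, 32, 2048, 16_383):
        print(f"rank {r:5d} -> coords {grid_coords(r, dims)}")

Under a scheme like this, each rank's coordinates would determine which shard of the model weights and which portion of the data it owns.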

Bibliographic Details
Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-14
Main Authors: Singh, Siddharth, Singhania, Prajwal, Ranjan, Aditya, Kirchenbauer, John, Geiping, Jonas, Wen, Yuxin, Jain, Neel, Hans, Abhimanyu, Shu, Manli, Tomar, Aditya, Goldstein, Tom, Bhatele, Abhinav
Format: Conference Proceeding
Language: English
Published: IEEE, 17.11.2024