CAPTURE: Memory-Centric Partitioning for Distributed DNN Training with Hybrid Parallelism
Deep Learning (DL) model sizes are increasing at a rapid pace, as larger models typically offer better statistical performance. Modern Large Language Models (LLMs) and image processing models contain billions of trainable parameters. Training such massive neural networks incurs significant memory re...
| Published in: | Proceedings - International Conference on High Performance Computing pp. 76 - 86 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 18.12.2023 |
| ISSN: | 2640-0316 |