ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7 |
|---|---|
| Main Authors: | |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 22.06.2025 |
| Summary: | Full-batch Graph Neural Network (GNN) training is indispensable for interdisciplinary applications. Although full-batch training has advantages in convergence accuracy and speed, it still faces challenges such as severe load imbalance and high communication overhead. To address these challenges, we propose ParGNN, an efficient full-batch training system for GNNs that adopts a profiler-guided adaptive load-balancing method along with graph over-partitioning to alleviate load imbalance. Based on the over-partition results, we present a subgraph pipeline algorithm to overlap communication and computation while maintaining the accuracy of GNN training. Extensive experiments demonstrate that ParGNN not only obtains the highest accuracy but also reaches the preset accuracy in the shortest time. In end-to-end experiments on four datasets, ParGNN outperforms two state-of-the-art full-batch GNN systems, PipeGCN and DGL, achieving speedups of up to 2.7× and 21.8×, respectively. |
| DOI: | 10.1109/DAC63849.2025.11133102 |
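The summary describes two ideas: over-partitioning the graph into more subgraphs than GPUs so the parts can be balanced across workers, and pipelining subgraphs so that boundary (halo) communication for the next subgraph overlaps with computation on the current one. The sketch below is not ParGNN's implementation; it is a minimal illustration of those two ideas under assumed names (`estimate` costs, `fetch_halo`, and `compute_subgraph` are hypothetical stand-ins), and a Python thread models an asynchronous transfer.

```python
import heapq
import threading

def assign_overpartitions(part_costs, num_gpus):
    """Greedy longest-processing-time assignment of over-partitioned
    subgraphs to GPUs (a stand-in for a profiler-guided policy)."""
    # Min-heap of (accumulated load, gpu_id); place the heaviest parts first.
    heap = [(0.0, g) for g in range(num_gpus)]
    heapq.heapify(heap)
    assignment = {g: [] for g in range(num_gpus)}
    for part, cost in sorted(enumerate(part_costs), key=lambda x: -x[1]):
        load, gpu = heapq.heappop(heap)
        assignment[gpu].append(part)
        heapq.heappush(heap, (load + cost, gpu))
    return assignment

def pipelined_epoch(subgraphs, fetch_halo, compute_subgraph):
    """Overlap halo communication for subgraph k+1 with computation on
    subgraph k. A thread stands in for an asynchronous transfer."""
    results = []
    # Fetch the first subgraph's halo features synchronously.
    halo = fetch_halo(subgraphs[0])
    for k, sg in enumerate(subgraphs):
        prefetch, buf = None, {}
        if k + 1 < len(subgraphs):
            # Start communication for subgraph k+1 ...
            prefetch = threading.Thread(
                target=lambda g=subgraphs[k + 1]: buf.update(fetch_halo(g)))
            prefetch.start()
        # ... while computing on subgraph k.
        results.append(compute_subgraph(sg, halo))
        if prefetch is not None:
            prefetch.join()
            halo = buf
    return results
```

On real multi-GPU hardware, the thread would typically be replaced by non-blocking collectives (e.g., `torch.distributed.isend`/`irecv`) or a dedicated CUDA stream, and the per-part costs fed to `assign_overpartitions` would come from the runtime profiler rather than being supplied by hand.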