ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7 |
|---|---|
| Main Authors: | |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 22.06.2025 |
| Summary: | Full-batch Graph Neural Network (GNN) training is indispensable for interdisciplinary applications. Although full-batch training has advantages in convergence accuracy and speed, it still faces challenges such as severe load imbalance and high communication overhead. To address these challenges, we propose ParGNN, an efficient full-batch training system for GNNs that adopts a profiler-guided adaptive load-balancing method along with graph over-partitioning to alleviate load imbalance. Based on the over-partition results, we present a subgraph pipeline algorithm to overlap communication and computation while maintaining the accuracy of GNN training. Extensive experiments demonstrate that ParGNN not only obtains the highest accuracy but also reaches the preset accuracy in the shortest time. In end-to-end experiments on four datasets, ParGNN outperforms two state-of-the-art full-batch GNN systems, PipeGCN and DGL, achieving speedups of up to 2.7× and 21.8×, respectively. |
| DOI: | 10.1109/DAC63849.2025.11133102 |
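The summary describes two ideas: over-partitioning the graph into more subgraphs than GPUs so the parts can be balanced across workers, and pipelining subgraphs so that boundary (halo) communication for the next subgraph overlaps with computation on the current one. The sketch below is not ParGNN's implementation; it is a minimal illustration of those two ideas under assumed names (`estimate` costs, `fetch_halo`, and `compute_subgraph` are hypothetical stand-ins), and a Python thread models an asynchronous transfer.

```python
import heapq
import threading

def assign_overpartitions(part_costs, num_gpus):
    """Greedy longest-processing-time assignment of over-partitioned
    subgraphs to GPUs (a stand-in for a profiler-guided policy)."""
    # Min-heap of (accumulated load, gpu_id); place the heaviest parts first.
    heap = [(0.0, g) for g in range(num_gpus)]
    heapq.heapify(heap)
    assignment = {g: [] for g in range(num_gpus)}
    for part, cost in sorted(enumerate(part_costs), key=lambda x: -x[1]):
        load, gpu = heapq.heappop(heap)
        assignment[gpu].append(part)
        heapq.heappush(heap, (load + cost, gpu))
    return assignment

def pipelined_epoch(subgraphs, fetch_halo, compute_subgraph):
    """Overlap halo communication for subgraph k+1 with computation on
    subgraph k. A thread stands in for an asynchronous transfer."""
    results = []
    # Fetch the first subgraph's halo features synchronously.
    halo = fetch_halo(subgraphs[0])
    for k, sg in enumerate(subgraphs):
        prefetch, buf = None, {}
        if k + 1 < len(subgraphs):
            # Start communication for subgraph k+1 ...
            prefetch = threading.Thread(
                target=lambda g=subgraphs[k + 1]: buf.update(fetch_halo(g)))
            prefetch.start()
        # ... while computing on subgraph k.
        results.append(compute_subgraph(sg, halo))
        if prefetch is not None:
            prefetch.join()
            halo = buf
    return results
```

On real multi-GPU hardware, the thread would typically be replaced by non-blocking collectives (e.g., `torch.distributed.isend`/`irecv`) or a dedicated CUDA stream, and the per-part costs fed to `assign_overpartitions` would come from the runtime profiler rather than being supplied by hand.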