ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7 |
|---|---|
| Main authors: | Gu, Junyu; Li, Shunde; Cao, Rongqiang; Wang, Jue; Wang, Zijian; Liang, Zhiqiang; Liu, Fang; Li, Shigang; Zhou, Chunbao; Wang, Yangang; Chi, Xuebin |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 22 June 2025 |
| Online access: | Full text |
| Abstract | Full-batch Graph Neural Network (GNN) training is indispensable for interdisciplinary applications. Although full-batch training has advantages in convergence accuracy and speed, it still faces challenges such as severe load imbalance and high communication overhead. To address these challenges, we propose ParGNN, an efficient full-batch training system for GNNs, which adopts a profiler-guided adaptive load balancing method along with graph over-partition to alleviate load imbalance. Based on the over-partition results, we present a subgraph pipeline algorithm that overlaps communication and computation while maintaining the accuracy of GNN training. Extensive experiments demonstrate that ParGNN not only obtains the highest accuracy but also reaches the preset accuracy in the shortest time. In end-to-end experiments on four datasets, ParGNN outperforms two state-of-the-art full-batch GNN systems, PipeGCN and DGL, achieving speedups of up to 2.7× and 21.8×, respectively. |
|---|---|
| Author | Chi, Xuebin; Wang, Jue; Wang, Zijian; Li, Shigang; Gu, Junyu; Zhou, Chunbao; Liang, Zhiqiang; Cao, Rongqiang; Li, Shunde; Wang, Yangang; Liu, Fang |
| Author_xml | 1. Gu, Junyu (jygu@cnic.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 2. Li, Shunde (lishunde@cnic.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 3. Cao, Rongqiang (caorq@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 4. Wang, Jue (wangjue@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 5. Wang, Zijian (wangzj@cnic.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 6. Liang, Zhiqiang (zqliang@cnic.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 7. Liu, Fang (liufang@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 8. Li, Shigang (shigangli.cs@gmail.com), Beijing University of Posts and Telecommunications, School of Computer Science, Beijing, China; 9. Zhou, Chunbao (zhoucb@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 10. Wang, Yangang (wangyg@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China; 11. Chi, Xuebin (chi@sccas.cn), Chinese Academy of Sciences, Computer Network Information Center, Beijing, China |
| ContentType | Conference Proceeding |
| DOI | 10.1109/DAC63849.2025.11133102 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| EISBN | 9798331503048 |
| EndPage | 7 |
| ExternalDocumentID | 11133102 |
| Genre | orig-research |
| GrantInformation_xml | Funder: Chinese Academy of Sciences (funder ID: 10.13039/501100002367) |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 7 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-June-22 |
| PublicationDateYYYYMMDD | 2025-06-22 |
| PublicationDate_xml | – month: 06 year: 2025 text: 2025-June-22 day: 22 |
| PublicationDecade | 2020 |
| PublicationTitle | 2025 62nd ACM/IEEE Design Automation Conference (DAC) |
| PublicationTitleAbbrev | DAC |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1 |
| SubjectTerms | Accuracy; Computation and communication overlapping; Convergence; Design automation; Faces; Full-batch distributed training; Graph neural network; Graph neural networks; Graphics processing units; Load balancing; Load management; Partitioning algorithms; Pipelines; Training |
| Title | ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs |
| URI | https://ieeexplore.ieee.org/document/11133102 |