Efficient Scaling of Dynamic Graph Neural Networks
We present distributed algorithms for training dynamic Graph Neural Networks (GNNs) on large-scale graphs spanning multi-node, multi-GPU systems. To the best of our knowledge, this is the first scaling study on dynamic GNNs. We devise mechanisms for reducing GPU memory usage and identify two execu...
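The full abstract of this record names a graph-difference-based strategy for cutting CPU-GPU transfer time on dynamic graphs. The following is a minimal sketch of that general idea, not the authors' implementation: it assumes each snapshot is a plain set of edges and counts how many edges would have to be moved if only the difference between consecutive snapshots were sent. The names `snapshot_diff`, `apply_diff`, and `transfer_cost` are hypothetical and chosen only for illustration.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, int]


def snapshot_diff(prev: Set[Edge], curr: Set[Edge]) -> Tuple[Set[Edge], Set[Edge]]:
    """Return (added, removed): the edges that turn `prev` into `curr`."""
    return curr - prev, prev - curr


def apply_diff(prev: Set[Edge], added: Set[Edge], removed: Set[Edge]) -> Set[Edge]:
    """Rebuild the next snapshot from the previous one plus a diff."""
    return (prev - removed) | added


def transfer_cost(snapshots: List[Set[Edge]]) -> Tuple[int, int]:
    """Count edges moved when every snapshot is sent in full versus when
    the first snapshot is sent in full and only diffs are sent afterwards."""
    if not snapshots:
        return 0, 0
    full = sum(len(s) for s in snapshots)
    diff = len(snapshots[0])
    for prev, curr in zip(snapshots, snapshots[1:]):
        added, removed = snapshot_diff(prev, curr)
        diff += len(added) + len(removed)
    return full, diff


# Toy dynamic graph (assumed data): three snapshots that change slowly.
snapshots = [
    {(0, 1), (1, 2), (2, 3)},
    {(0, 1), (1, 2), (2, 3), (3, 4)},
    {(0, 1), (2, 3), (3, 4)},
]
full, diff = transfer_cost(snapshots)
print(f"edges sent in full: {full}, edges sent with diffs: {diff}")
```

In this toy setting the saving depends entirely on how much consecutive snapshots overlap; the sketch says nothing about the memory layout or transfer mechanics used in the paper.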
| Published in: | SC21: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13 |
|---|---|
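| Main authors: | Chakaravarthy, Venkatesan T.; Pandian, Shivmaran S.; Raje, Saurabh; Sabharwal, Yogish; Suzumura, Toyotaro; Ubaru, Shashanka |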
| Format: | Conference paper |
| Language: | English |
| Published: | ACM, 14 November 2021 |
| Subjects: | |
| ISSN: | 2167-4337 |
| Online access: | Get full text |
| Abstract | We present distributed algorithms for training dynamic Graph Neural Networks (GNNs) on large-scale graphs spanning multi-node, multi-GPU systems. To the best of our knowledge, this is the first scaling study on dynamic GNNs. We devise mechanisms for reducing GPU memory usage and identify two execution time bottlenecks: CPU-GPU data transfer and communication volume. Exploiting properties of dynamic graphs, we design a graph-difference-based strategy to significantly reduce the transfer time. We develop a simple but effective data distribution technique under which the communication volume remains fixed and linear in the input size for any number of GPUs. Our experiments using billion-size graphs on a system of 128 GPUs show that (i) the distribution scheme achieves up to 30x speedup on 128 GPUs, and (ii) the graph-difference technique reduces the transfer time by a factor of up to 4.1x and the overall execution time by up to 40%. |
|---|---|
| Author | Ubaru, Shashanka; Raje, Saurabh; Sabharwal, Yogish; Suzumura, Toyotaro; Pandian, Shivmaran S.; Chakaravarthy, Venkatesan T. |
| Author_xml | 1. Chakaravarthy, Venkatesan T. (vechakra@in.ibm.com), IBM Research India; 2. Pandian, Shivmaran S. (shivs017@in.ibm.com), IBM Research India; 3. Raje, Saurabh (saurabh.mraje@gmail.com), IBM Research India; 4. Sabharwal, Yogish (ysabharwal@in.ibm.com), IBM Research India; 5. Suzumura, Toyotaro (suzumura@acm.org), IBM T.J. Watson Research Center, USA; 6. Ubaru, Shashanka (shashanka.ubaru@ibm.com), IBM T.J. Watson Research Center, USA |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IL CBEJK RIE RIL |
| DOI | 10.1145/3458817.3480858 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP All) 1998-Present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISBN | 9781450384421 1450384420 |
| EISSN | 2167-4337 |
| EndPage | 13 |
| ExternalDocumentID | 9910072 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IF 6IH 6IK 6IL 6IN AAWTH ABLEC ADZIZ ALMA_UNASSIGNED_HOLDINGS BEFXN BFFAM BGNUA BKEBE BPEOZ CBEJK CHZPO IEGSK IPLJI OCL RIE RIL |
| ID | FETCH-LOGICAL-a275t-c54d7739f022984895914e4ba4d73f20c52ad802d44f5e9bcce45ff5a70abfb43 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 21 |
| IngestDate | Wed Aug 27 02:18:35 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | false |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a275t-c54d7739f022984895914e4ba4d73f20c52ad802d44f5e9bcce45ff5a70abfb43 |
| PageCount | 13 |
| ParticipantIDs | ieee_primary_9910072 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Nov.-14 |
| PublicationDateYYYYMMDD | 2021-11-14 |
| PublicationDate_xml | – month: 11 year: 2021 text: 2021-Nov.-14 day: 14 |
| PublicationDecade | 2020 |
| PublicationTitle | SC21: International Conference for High Performance Computing, Networking, Storage and Analysis |
| PublicationTitleAbbrev | SC |
| PublicationYear | 2021 |
| Publisher | ACM |
| Publisher_xml | – name: ACM |
| SSID | ssj0002871320 ssj0003204180 |
| Score | 1.9614563 |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 1 |
| SubjectTerms | Data transfer; Distributed algorithms; dynamic graphs; Graph neural networks; Graphics processing units; Heuristic algorithms; High performance computing; learning; Training |
| Title | Efficient Scaling of Dynamic Graph Neural Networks |
| URI | https://ieeexplore.ieee.org/document/9910072 |