Reducing Communication in Graph Neural Network Training
| Published in: | International Conference for High Performance Computing, Networking, Storage and Analysis (Online), Vol. 2020, pp. 1-14 |
|---|---|
| Main authors: | Tripathy, Alok; Yelick, Katherine; Buluc, Aydin |
| Format: | Conference Proceeding; Journal Article |
| Language: | English |
| Published: | United States: IEEE, 01.11.2020 |
| Subjects: | Clustering algorithms; communication-avoiding algorithms; distributed training; Graph neural networks; MATHEMATICS AND COMPUTING; Proteins; Sparse matrices; Three-dimensional displays; Training; Two dimensional displays |
| ISSN: | 2167-4329 |
| Online access: | Full text |
| Abstract | Graph Neural Networks (GNNs) are powerful and flexible neural networks that use the naturally sparse connectivity information of the data. GNNs represent this connectivity as sparse matrices, which have lower arithmetic intensity and thus higher communication costs compared to dense matrices, making GNNs harder to scale to high concurrencies than convolutional or fully-connected neural networks. We introduce a family of parallel algorithms for training GNNs and show that they can asymptotically reduce communication compared to previous parallel GNN training methods. We implement these algorithms, which are based on 1D, 1.5D, 2D, and 3D sparse-dense matrix multiplication, using torch.distributed on GPU-equipped clusters. Our algorithms optimize communication across the full GNN training pipeline. We train GNNs on over a hundred GPUs on multiple datasets, including a protein network with over a billion edges. |
|---|---|
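To make the abstract's arithmetic-intensity claim concrete (an illustrative count, not taken from the paper): a sparse-dense product A·H performs roughly 2·nnz(A)·f flops over at least nnz(A) + n·f words of traffic, so its flops-per-word ratio is capped near the average vertex degree d = nnz(A)/n whenever d is small relative to the feature width f; a dense n×n operand would instead allow O(n) reuse of each word of H. Low reuse means the kernel is communication-bound at scale, which is what motivates communication-avoiding algorithms.

The following is a minimal sketch of a 1D-partitioned distributed sparse-dense matrix multiplication (SpMM), the primitive the paper's algorithm family builds on. It is an illustration under stated assumptions, not the authors' implementation: it assumes a torch.distributed process group is already initialized, every rank holds the same number of rows, and the hypothetical inputs `adj_blocks` and `feat_local` are pre-split locally.

```python
# Sketch: 1D-partitioned SpMM computing this rank's block row of A @ H.
# Assumes dist.init_process_group() has already been called and that the
# number of graph vertices n is divisible by the number of ranks p.
import torch
import torch.distributed as dist

def spmm_1d(adj_blocks, feat_local):
    """adj_blocks[q]: this rank's sparse (n/p, n/p) block of A whose columns
    line up with rank q's rows of H (hypothetical pre-split layout).
    feat_local: this rank's dense (n/p, f) block row of the feature matrix H.
    Returns this rank's (n/p, f) block row of A @ H.
    """
    p = dist.get_world_size()
    rank = dist.get_rank()
    out = torch.zeros_like(feat_local)

    # p communication stages: in stage q, rank q broadcasts its rows of H,
    # and every rank multiplies its matching sparse block against them.
    for q in range(p):
        block = feat_local.clone() if rank == q else torch.empty_like(feat_local)
        dist.broadcast(block, src=q)                  # the communication step
        out += torch.sparse.mm(adj_blocks[q], block)  # local SpMM, no communication
    return out
```

In GNN training this product is one layer's neighborhood aggregation; broadly, the 1.5D, 2D, and 3D variants named in the abstract trade replication of the sparse or dense operand for asymptotically less communication.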
| Author | Buluc, Aydin; Yelick, Katherine; Tripathy, Alok |
| Author affiliations | Alok Tripathy, Katherine Yelick, and Aydin Buluc: University of California, Electrical Engineering and Computer Sciences, Berkeley |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding; Journal Article |
| CorporateAuthor | Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility (OLCF) Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States) |
| DOI | 10.1109/SC41405.2020.00074 |
| EISBN | 1728199980; 9781728199986 |
| EndPage | 14 |
| ExternalDocumentID | 1772909 9355273 |
| Genre | orig-research |
| Funding | Office of Science (funder ID 10.13039/100006132); Advanced Scientific Computing Research (10.13039/100006192); Oak Ridge National Laboratory (10.13039/100006228); National Science Foundation (10.13039/100000001) |
| ISICitedReferencesCount | 42 |
| ISSN | 2167-4329 |
| Language | English |
| Notes | Funding: USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR); National Science Foundation (NSF); USDOE National Nuclear Security Administration (NNSA). Contract/grant numbers: AC02-05CH11231; DGE 1752814; 1823034; AC05-00OR22725 |
| OpenAccessLink | https://www.osti.gov/servlets/purl/1772909 |
| PageCount | 14 |
| PublicationDate | 2020-11-01 |
| PublicationPlace | United States |
| PublicationTitle | International Conference for High Performance Computing, Networking, Storage and Analysis (Online) |
| PublicationTitleAbbrev | SC |
| PublicationYear | 2020 |
| Publisher | IEEE |
| StartPage | 1 |
| SubjectTerms | Clustering algorithms; communication-avoiding algorithms; distributed training; Graph neural networks; MATHEMATICS AND COMPUTING; Proteins; Sparse matrices; Three-dimensional displays; Training; Two dimensional displays |
| Title | Reducing Communication in Graph Neural Network Training |
| URI | https://ieeexplore.ieee.org/document/9355273 https://www.osti.gov/servlets/purl/1772909 |
| Volume | 2020 |