Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems
| Published in: | Parallel Computing, Vol. 106, Issue C, Article 102769 |
|---|---|
| Main authors: | Acer, Seher; Boman, Erik G.; Glusa, Christian A.; Rajamanickam, Sivasankaran |
| Format: | Journal Article |
| Language: | English |
| Published: | Netherlands: Elsevier B.V., 1 September 2021 |
| Subjects: | Distributed-memory systems; GPUs; Graph partitioning; Spectral partitioning |
| ISSN: | 0167-8191, 1872-7336 |
| Online access: | Full text |
| Abstract | Graph partitioning has been an important tool to partition the work among several processors to minimize the communication cost and balance the workload. While accelerator-based supercomputers are emerging to be the standard, the use of graph partitioning becomes even more important as applications are rapidly moving to these architectures. However, there is no distributed-memory-parallel, multi-GPU graph partitioner available for applications. We developed a spectral graph partitioner, Sphynx, using the portable, accelerator-friendly stack of the Trilinos framework. In Sphynx, we allow using different preconditioners and exploit their unique advantages. We use Sphynx to systematically evaluate the various algorithmic choices in spectral partitioning with a focus on the GPU performance. We perform those evaluations on two distinct classes of graphs: regular (such as meshes, matrices from finite element methods) and irregular (such as social networks and web graphs), and show that different settings and preconditioners are needed for these graph classes. The experimental results on the Summit supercomputer show that Sphynx is the fastest alternative on irregular graphs in an application-friendly setting and obtains a partitioning quality close to ParMETIS on regular graphs. When compared to nvGRAPH on a single GPU, Sphynx is faster and obtains better balance and better quality partitions. Sphynx provides a good and robust partitioning method across a wide range of graphs for applications looking for a GPU-based partitioner.
• Sphynx is the first multi-GPU graph partitioner on distributed-memory systems. • Sphynx uses a spectral method followed by a fast geometric method. • Sphynx is flexible and provides multiple preconditioners in the eigensolver. • Sphynx has been tuned for different preconditioners and graph types. |
|---|---|
| ArticleNumber | 102769 |
| Author | Boman, Erik G. Rajamanickam, Sivasankaran Glusa, Christian A. Acer, Seher |
| Author_xml | 1. Acer, Seher (ORCID 0000-0003-3951-3930, sacer@sandia.gov); 2. Boman, Erik G. (egboman@sandia.gov); 3. Glusa, Christian A. (ORCID 0000-0003-2247-1914, caglusa@sandia.gov); 4. Rajamanickam, Sivasankaran (srajama@sandia.gov) |
| BackLink | https://www.osti.gov/biblio/1862429$$D View this record in Osti.gov |
| CitedBy_id | crossref_primary_10_1109_TPDS_2022_3208082 crossref_primary_10_1007_s10586_023_03988_x crossref_primary_10_1137_23M1559129 crossref_primary_10_1016_j_eswa_2024_124677 crossref_primary_10_1145_3571808 crossref_primary_10_1007_s10766_024_00781_0 |
| Cites_doi | 10.1137/1.9781611974317.15 10.1109/TPAMI.2010.88 10.1137/1.9781611976137.4 10.1109/71.780863 10.1137/0611030 10.1137/S0895479801384019 10.1016/j.jpdc.2014.07.003 10.1109/IPDPS.2006.1639359 10.1137/15M1026183 10.1109/TPDS.2015.2412545 10.1145/2049662.2049663 10.1155/2020/3042642 10.1016/j.jcp.2006.02.007 10.1137/S0036144502409019 10.21136/CMJ.1973.101168 10.1137/0916028 10.1007/3-540-61142-8_588 10.1145/1527286.1527287 10.1137/S1064827595287997 10.1109/TPDS.2017.2671868 10.1137/090771430 10.1007/3-540-47789-6_66 10.1109/34.868688 10.1145/217474.217529 10.1137/S1064827500366124 |
| ContentType | Journal Article |
| Copyright | 2021 Elsevier B.V. |
| Copyright_xml | – notice: 2021 Elsevier B.V. |
| DOI | 10.1016/j.parco.2021.102769 |
| DatabaseName | CrossRef OSTI.GOV |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1872-7336 |
| ExternalDocumentID | 1862429 10_1016_j_parco_2021_102769 S0167819121000272 |
| ISICitedReferencesCount | 9 |
| ISSN | 0167-8191 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | C |
| Keywords | Graph partitioning Distributed-memory systems Spectral partitioning GPUs |
| Language | English |
| LinkModel | OpenURL |
| Notes | 17-SC-20-SC; NA-0003525; AC05-00OR22725 USDOE National Nuclear Security Administration (NNSA) |
| ORCID | 0000-0003-2247-1914 0000-0003-3951-3930 |
| OpenAccessLink | https://www.osti.gov/biblio/1862429 |
| PublicationCentury | 2000 |
| PublicationDate | September 2021 2021-09-00 2021-09-01 |
| PublicationDateYYYYMMDD | 2021-09-01 |
| PublicationDate_xml | – month: 09 year: 2021 text: September 2021 |
| PublicationDecade | 2020 |
| PublicationPlace | Netherlands |
| PublicationPlace_xml | – name: Netherlands |
| PublicationTitle | Parallel computing |
| PublicationYear | 2021 |
| Publisher | Elsevier B.V Elsevier |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier |
| SourceID | osti crossref elsevier |
| SourceType | Open Access Repository Enrichment Source Index Database Publisher |
| StartPage | 102769 |
| SubjectTerms | Distributed-memory systems GPUs Graph partitioning Spectral partitioning |
| Title | Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems |
| URI | https://dx.doi.org/10.1016/j.parco.2021.102769 https://www.osti.gov/biblio/1862429 |
| Volume | 106 |
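
The abstract and highlights in this record describe a spectral method followed by a fast geometric method. As a rough illustration of that general idea only, the following minimal sketch bisects a small graph with NumPy. It is not Sphynx code and makes several simplifying assumptions: a dense symmetric eigensolver (`numpy.linalg.eigh`) stands in for the preconditioned, GPU-capable eigensolver mentioned in the abstract, a median split of the second-smallest eigenvector (the Fiedler vector) stands in for the geometric step, and the function names `spectral_bisection`, `edge_cut`, and `path_graph` are hypothetical helpers introduced here.

```python
import numpy as np


def spectral_bisection(adjacency):
    """Split a connected, undirected graph into two equal-sized parts.

    Assumption: the graph is small enough for a dense eigensolver.
    """
    A = np.asarray(adjacency, dtype=float)
    # Combinatorial graph Laplacian L = D - A.
    laplacian = np.diag(A.sum(axis=1)) - A
    # eigh returns eigenvalues of a symmetric matrix in ascending order,
    # so column 1 holds the eigenvector of the second-smallest eigenvalue.
    _, vectors = np.linalg.eigh(laplacian)
    fiedler = vectors[:, 1]
    # Stand-in for the geometric step: treat each Fiedler entry as a 1-D
    # coordinate and cut the sorted vertices at the median.
    order = np.argsort(fiedler, kind="stable")
    parts = np.zeros(A.shape[0], dtype=int)
    parts[order[A.shape[0] // 2:]] = 1
    return parts


def edge_cut(adjacency, parts):
    """Count edges whose endpoints lie in different parts."""
    rows, cols = np.nonzero(np.triu(np.asarray(adjacency), k=1))
    return int(np.sum(parts[rows] != parts[cols]))


def path_graph(n):
    """Adjacency matrix of a path on n vertices (toy input, an assumption)."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return A


if __name__ == "__main__":
    A = path_graph(8)
    parts = spectral_bisection(A)
    print("parts:", parts.tolist())
    print("edge cut:", edge_cut(A, parts))
```

On the 8-vertex path, the sketch places the first four vertices in one part and the last four in the other, cutting a single edge; which half receives label 0 depends on the sign of the computed eigenvector. A production partitioner would use a sparse, iterative eigensolver and a multi-way geometric partitioner instead of the dense solve and median cut assumed here.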