Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems

Graph partitioning has been an important tool to partition the work among several processors to minimize the communication cost and balance the workload. While accelerator-based supercomputers are emerging to be the standard, the use of graph partitioning becomes even more important as applications...

Bibliographic details
Published in: Parallel Computing, Vol. 106, Issue C, p. 102769
Main authors: Acer, Seher; Boman, Erik G.; Glusa, Christian A.; Rajamanickam, Sivasankaran
Format: Journal Article
Language: English
Published: Netherlands, Elsevier B.V., 1 September 2021
Elsevier
Subjects: Distributed-memory systems; GPUs; Graph partitioning; Spectral partitioning
ISSN: 0167-8191, 1872-7336
Online access: Full text
Abstract Graph partitioning has been an important tool to partition the work among several processors to minimize the communication cost and balance the workload. While accelerator-based supercomputers are emerging to be the standard, the use of graph partitioning becomes even more important as applications are rapidly moving to these architectures. However, there is no distributed-memory-parallel, multi-GPU graph partitioner available for applications. We developed a spectral graph partitioner, Sphynx, using the portable, accelerator-friendly stack of the Trilinos framework. In Sphynx, we allow using different preconditioners and exploit their unique advantages. We use Sphynx to systematically evaluate the various algorithmic choices in spectral partitioning with a focus on the GPU performance. We perform those evaluations on two distinct classes of graphs: regular (such as meshes, matrices from finite element methods) and irregular (such as social networks and web graphs), and show that different settings and preconditioners are needed for these graph classes. The experimental results on the Summit supercomputer show that Sphynx is the fastest alternative on irregular graphs in an application-friendly setting and obtains a partitioning quality close to ParMETIS on regular graphs. When compared to nvGRAPH on a single GPU, Sphynx is faster and obtains better balance and better quality partitions. Sphynx provides a good and robust partitioning method across a wide range of graphs for applications looking for a GPU-based partitioner.
• Sphynx is the first multi-GPU graph partitioner on distributed-memory systems.
• Sphynx uses a spectral method followed by a fast geometric method.
• Sphynx is flexible and provides multiple preconditioners in the eigensolver.
• Sphynx has been tuned for different preconditioners and graph types.
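The spectral-then-geometric approach described in the abstract can be illustrated with a minimal sketch. It assumes a tiny undirected graph held in memory, uses plain power iteration in place of the preconditioned eigensolver the abstract mentions, and uses a one-dimensional median split as a stand-in for the geometric step. All function names are hypothetical and are not Sphynx or Trilinos APIs.

```python
# Illustrative sketch only: plain power iteration stands in for a
# preconditioned eigensolver, and a median split stands in for the
# geometric step. None of these names come from Sphynx or Trilinos.


def build_adjacency(n, edges):
    """Build adjacency lists for an undirected graph with n vertices."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def laplacian_apply(adj, x):
    """Return L @ x for the combinatorial Laplacian L = D - A."""
    return [len(adj[i]) * x[i] - sum(x[j] for j in adj[i])
            for i in range(len(adj))]


def fiedler_vector(adj, iterations=200):
    """Approximate the eigenvector of the second-smallest eigenvalue of L."""
    n = len(adj)
    # Every eigenvalue of L is at most 2 * (max degree), so the largest
    # eigenvalues of shift*I - L correspond to the smallest ones of L.
    shift = 2.0 * max(len(nbrs) for nbrs in adj)
    x = [float(i) for i in range(n)]  # deterministic starting guess
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # project out the constant eigenvector
        lx = laplacian_apply(adj, x)
        x = [shift * v - w for v, w in zip(x, lx)]
        norm = sum(v * v for v in x) ** 0.5
        x = [v / norm for v in x]
    return x


def bisect(adj):
    """Split the vertices into two halves by sorting on the Fiedler vector."""
    x = fiedler_vector(adj)
    order = sorted(range(len(adj)), key=lambda i: x[i])
    part = [0] * len(adj)
    for i in order[len(adj) // 2:]:
        part[i] = 1
    return part


def edge_cut(edges, part):
    """Count edges whose endpoints land in different parts."""
    return sum(1 for u, v in edges if part[u] != part[v])


# Two triangles joined by the single edge (2, 3).
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
ADJ = build_adjacency(6, EDGES)
PART = bisect(ADJ)
print(PART, "cut =", edge_cut(EDGES, PART))
```

On this toy graph the two triangles end up in different parts, so only the bridging edge is cut. A production partitioner would replace the power iteration with a block eigensolver and would compute more than one eigenvector when more than two parts are requested.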
ArticleNumber 102769
Author Boman, Erik G.
Rajamanickam, Sivasankaran
Glusa, Christian A.
Acer, Seher
Author_xml – sequence: 1
  givenname: Seher
  orcidid: 0000-0003-3951-3930
  surname: Acer
  fullname: Acer, Seher
  email: sacer@sandia.gov
– sequence: 2
  givenname: Erik G.
  surname: Boman
  fullname: Boman, Erik G.
  email: egboman@sandia.gov
– sequence: 3
  givenname: Christian A.
  orcidid: 0000-0003-2247-1914
  surname: Glusa
  fullname: Glusa, Christian A.
  email: caglusa@sandia.gov
– sequence: 4
  givenname: Sivasankaran
  surname: Rajamanickam
  fullname: Rajamanickam, Sivasankaran
  email: srajama@sandia.gov
BackLink https://www.osti.gov/biblio/1862429$$D View this record in Osti.gov
CitedBy_id crossref_primary_10_1109_TPDS_2022_3208082
crossref_primary_10_1007_s10586_023_03988_x
crossref_primary_10_1137_23M1559129
crossref_primary_10_1016_j_eswa_2024_124677
crossref_primary_10_1145_3571808
crossref_primary_10_1007_s10766_024_00781_0
ContentType Journal Article
Copyright 2021 Elsevier B.V.
Copyright_xml – notice: 2021 Elsevier B.V.
DBID AAYXX
CITATION
OTOTI
DOI 10.1016/j.parco.2021.102769
DatabaseName CrossRef
OSTI.GOV
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-7336
ExternalDocumentID 1862429
10_1016_j_parco_2021_102769
S0167819121000272
ISICitedReferencesCount 9
ISSN 0167-8191
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue C
Keywords Graph partitioning
Distributed-memory systems
Spectral partitioning
GPUs
Language English
LinkModel OpenURL
Notes 17-SC-20-SC; NA-0003525; AC05-00OR22725
USDOE National Nuclear Security Administration (NNSA)
ORCID 0000-0003-2247-1914
0000-0003-3951-3930
OpenAccessLink https://www.osti.gov/biblio/1862429
PublicationCentury 2000
PublicationDate September 2021
2021-09-00
2021-09-01
PublicationDateYYYYMMDD 2021-09-01
PublicationDate_xml – month: 09
  year: 2021
  text: September 2021
PublicationDecade 2020
PublicationPlace Netherlands
PublicationPlace_xml – name: Netherlands
PublicationTitle Parallel computing
PublicationYear 2021
Publisher Elsevier B.V
Elsevier
Publisher_xml – name: Elsevier B.V
– name: Elsevier
SourceID osti
crossref
elsevier
SourceType Open Access Repository
Enrichment Source
Index Database
Publisher
StartPage 102769
SubjectTerms Distributed-memory systems
GPUs
Graph partitioning
Spectral partitioning
Title Sphynx: A parallel multi-GPU graph partitioner for distributed-memory systems
URI https://dx.doi.org/10.1016/j.parco.2021.102769
https://www.osti.gov/biblio/1862429
Volume 106