A distributed-memory hierarchical solver for general sparse linear systems


Detailed description

Bibliographic details
Published in: Parallel Computing, Vol. 74, Issue C, pp. 49-64
Main authors: Chen, Chao; Pouransari, Hadi; Rajamanickam, Sivasankaran; Boman, Erik G.; Darve, Eric
Format: Journal Article
Language: English
Published: United States, Elsevier B.V, 2018-05-01
Elsevier
Subjects:
ISSN: 0167-8191, 1872-7336
Online access: Full text
Abstract
• Derived a new formulation of a sequential hierarchical solver, which compresses dense fill-in blocks.
• Proposed a new parallel algorithm for solving general sparse linear systems based on data decomposition.
• Implemented a task-based asynchronous scheme by exploiting data dependency in our algorithm.
• Implemented a coloring scheme to extract concurrency in the execution.
• Provided benchmarks for various problems and analysis of parallel scalability under different conditions.
We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because it exploits the low-rank structure of fill-in blocks. Depending on the accuracy of low-rank approximations, the hierarchical solver can be used either as a direct solver or as a preconditioner. The parallel algorithm is based on data decomposition and requires only local communication for updating boundary data on every processor. Moreover, the computation-to-communication ratio of the parallel algorithm is approximately the volume-to-surface-area ratio of the subdomain owned by every processor. We present various numerical results to demonstrate the versatility and scalability of the parallel algorithm.
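The abstract's statement that the computation-to-communication ratio is approximately the volume-to-surface-area ratio of a subdomain can be illustrated with a small sketch. The model below is an assumption made only for illustration and is not taken from the paper: each processor is imagined to own a cubic block of n x n x n grid points, work is taken as proportional to the number of owned points, and communication as proportional to the number of points on the block's boundary. The function name `cube_volume_to_surface` is hypothetical.

```python
def cube_volume_to_surface(n: int) -> float:
    """Ratio of owned points to boundary points for an n x n x n block.

    Illustrative assumption: work scales with all owned points (volume),
    communication scales with points on the outer layer (surface).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    volume = n ** 3
    # Points strictly inside the block, i.e. not on the outer layer.
    interior = max(n - 2, 0) ** 3
    surface = volume - interior
    return volume / surface


if __name__ == "__main__":
    # Larger subdomains per processor give a larger ratio, roughly n / 6.
    for n in (2, 10, 100):
        print(n, cube_volume_to_surface(n))
```

Under this simplified model the ratio grows about linearly with the block edge length, which is one way to read why larger subdomains per processor would be expected to amortize boundary updates better.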
Author Rajamanickam, Sivasankaran
Boman, Erik G.
Darve, Eric
Pouransari, Hadi
Chen, Chao
Author_xml – sequence: 1
  givenname: Chao
  orcidid: 0000-0002-5385-3651
  surname: Chen
  fullname: Chen, Chao
  email: cchen10@stanford.edu
  organization: Institute for Computational and Mathematical Engineering, Stanford University, Stanford, USA
– sequence: 2
  givenname: Hadi
  surname: Pouransari
  fullname: Pouransari, Hadi
  email: hadip@stanford.edu
  organization: Department of Mechanical Engineering, Stanford University, Stanford, USA
– sequence: 3
  givenname: Sivasankaran
  surname: Rajamanickam
  fullname: Rajamanickam, Sivasankaran
  email: srajama@sandia.gov
  organization: Center for Computing Research, Sandia National Laboratories, Albuquerque, USA
– sequence: 4
  givenname: Erik G.
  surname: Boman
  fullname: Boman, Erik G.
  email: egboman@sandia.gov
  organization: Center for Computing Research, Sandia National Laboratories, Albuquerque, USA
– sequence: 5
  givenname: Eric
  orcidid: 0000-0002-1938-3836
  surname: Darve
  fullname: Darve, Eric
  email: darve@stanford.edu
  organization: Institute for Computational and Mathematical Engineering, Stanford University, Stanford, USA
BackLink https://www.osti.gov/servlets/purl/1429626$$D View this record in Osti.gov
CitedBy_id crossref_primary_10_1016_j_jcp_2019_07_024
crossref_primary_10_1007_s10586_020_03188_x
crossref_primary_10_1088_1742_6596_1158_3_032037
crossref_primary_10_1137_20M1315683
crossref_primary_10_1137_20M1316330
crossref_primary_10_1080_17445760_2021_2004412
crossref_primary_10_1016_j_parco_2017_12_001
crossref_primary_10_1016_j_cpc_2019_106975
crossref_primary_10_1137_20M1380624
crossref_primary_10_1137_19M129961X
crossref_primary_10_1002_nme_7076
crossref_primary_10_1002_nme_7100
crossref_primary_10_1016_j_parco_2022_102956
Cites_doi 10.1137/S0895479895291765
10.1002/nla.1680010405
10.1145/992200.992206
10.1145/2830569
10.1109/88.242438
10.1137/S0895479897317685
10.1007/s00211-009-0218-6
10.1137/S0895479803436652
10.1137/1034116
10.1007/s006070050015
10.1137/120903476
10.1137/S1064827595287997
10.1137/080732158
10.1007/PL00021408
10.1137/15M1034477
10.1145/1391989.1391995
10.1016/S0377-0427(00)00516-1
10.1137/0710032
10.1137/15M1046939
10.1016/j.jcp.2015.10.012
10.1007/s00607-004-0102-2
10.1145/779359.779361
10.1002/nla.691
10.1017/S0962492916000076
10.1142/S0129626414420055
10.1137/09074543X
10.1007/s00607-002-1450-4
10.1016/j.parco.2007.12.001
ContentType Journal Article
Copyright 2017
Copyright_xml – notice: 2017
CorporateAuthor Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
CorporateAuthor_xml – name: Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)
– name: Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
DBID AAYXX
CITATION
OIOZB
OTOTI
DOI 10.1016/j.parco.2017.12.004
DatabaseName CrossRef
OSTI.GOV - Hybrid
OSTI.GOV
DatabaseTitle CrossRef
DatabaseTitleList

DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1872-7336
EndPage 64
ExternalDocumentID 1429626
10_1016_j_parco_2017_12_004
S0167819117302077
ID FETCH-LOGICAL-c375t-2c7cae0dc29b91ae112a6ac77159b318c47ddb1e9783e22a55536c9aa5c90af83
ISICitedReferencesCount 16
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000428486900005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0167-8191
IngestDate Thu May 18 18:38:15 EDT 2023
Sat Nov 29 07:26:06 EST 2025
Tue Nov 18 22:23:31 EST 2025
Fri Feb 23 02:46:05 EST 2024
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue C
Keywords Parallel linear solver
Sparse matrix
Hierarchical matrix
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c375t-2c7cae0dc29b91ae112a6ac77159b318c47ddb1e9783e22a55536c9aa5c90af83
Notes AC04-94AL85000; NA0002373-1; AC02-05CH11231; NA-0003525
USDOE Office of Science (SC)
USDOE National Nuclear Security Administration (NNSA)
Stanford Univ., CA (United States)
SAND2017-0977J
ORCID 0000-0002-5385-3651
0000-0002-1938-3836
0000000253853651
0000000219383836
OpenAccessLink https://www.osti.gov/servlets/purl/1429626
PageCount 16
ParticipantIDs osti_scitechconnect_1429626
crossref_citationtrail_10_1016_j_parco_2017_12_004
crossref_primary_10_1016_j_parco_2017_12_004
elsevier_sciencedirect_doi_10_1016_j_parco_2017_12_004
PublicationCentury 2000
PublicationDate 2018-05-01
PublicationDateYYYYMMDD 2018-05-01
PublicationDate_xml – month: 05
  year: 2018
  text: 2018-05-01
  day: 01
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle Parallel computing
PublicationYear 2018
Publisher Elsevier B.V
Elsevier
Publisher_xml – name: Elsevier B.V
– name: Elsevier
References Duff, Erisman, Reid (bib0003) 1986
Xia, Chandrasekaran, Gu, Li (bib0029) 2010; 17
Hackbusch (bib0027) 1999; 62
Grama, Gupta, Kumar (bib0037) 1993; 1
Lin, Bettencourt, Domino, Fisher, Hoemmen, Hu, Phipps, Prokopenko, Rajamanickam, Siefert, Kennon (bib0012) 2014
Hackbusch, Khoromskij (bib0028) 2000; 64
Kriemann (bib0020) 2005; 74
Wang, Li, Rouet, Xia, De Hoop (bib0022) 2016; 42
Grasedyck, Kriemann, Le Borne (bib0014) 2009; 112
Chandrasekaran, Gu, Pals (bib0030) 2006; 28
Chen, Davis, Hager, Rajamanickam (bib0004) 2008; 35
Li, Demmel, Gilbert, Grigori, Shao, Yamazaki (bib0038) 1999
Davis (bib0005) 2004; 30
Hackbusch (bib0032) 2015
Kuzmin, Luisier, Schenk (bib0007) 2013; 8097
Amestoy, Duff, L’Excellent, Koster (bib0009) 2001
Boman, Catalyurek, Chevalier, Devine (bib0035) 2012; 20
Pouransari, Coulier, Darve (bib0019) 2017; 39
Karypis, Kumar (bib0033) 1998; 20
Xia, Chandrasekaran, Gu, Li (bib0015) 2009; 31
Bozdağ, Çatalyürek, Gebremedhin, Manne, Boman, Özgüner (bib0036) 2010; 32
Aminfar, Ambikasaran, Darve (bib0018) 2016; 304
Amestoy, Ashcraft, Boiteau, Buttari, L’Excellent, Weisbecker (bib0017) 2015; 37
Stüben (bib0026) 2001; 128
Lin, Bettencourt, Domino, Fisher, Hoemmen, Hu, Phipps, Prokopenko, Rajamanickam, Siefert (bib0013) 2014
Ho, Ying (bib0016) 2015
Brandt (bib0025) 1986; 19
Xu (bib0011) 1992; 34
Davis, Rajamanickam, Sid-Lakhdar (bib0001) 2016; 25
Li, Demmel (bib0010) 2003; 29
George (bib0002) 1973; 10
Demmel, Eisenstat, Gilbert, Li, Liu (bib0006) 1999; 20
Chevalier, Pellegrini (bib0034) 2008; 34
Li, Ying (bib0023) 2016
Saad (bib0024) 1994; 1
Hackbusch, Börm (bib0031) 2002; 69
Coulier, Pouransari, Darve (bib0039) 2017; 39
Demmel, Gilbert, Li (bib0008) 1999; 20
Ghysels, Li, Rouet, Williams, Napov (bib0021) 2015
SSID ssj0006480
Score 2.31834
SourceID osti
crossref
elsevier
SourceType Open Access Repository
Enrichment Source
Index Database
Publisher
StartPage 49
SubjectTerms Hierarchical matrix
MATHEMATICS AND COMPUTING
Parallel linear solver
Sparse matrix
Title A distributed-memory hierarchical solver for general sparse linear systems
URI https://dx.doi.org/10.1016/j.parco.2017.12.004
https://www.osti.gov/servlets/purl/1429626
Volume 74
WOSCitedRecordID wos000428486900005&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVESC
  databaseName: Elsevier SD Freedom Collection Journals 2021
  customDbUrl:
  eissn: 1872-7336
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0006480
  issn: 0167-8191
  databaseCode: AIEXJ
  dateStart: 19950101
  isFulltext: true
  titleUrlDefault: https://www.sciencedirect.com
  providerName: Elsevier