PSelInv – A distributed memory parallel algorithm for selected inversion: The non-symmetric case

Bibliographic details
Published in: Parallel Computing, Vol. 74, Issue C, pp. 84-98
Main authors: Jacquelin, Mathias; Lin, Lin; Yang, Chao
Format: Journal Article
Language: English
Published: United States, Elsevier B.V., 1 May 2018
ISSN: 0167-8191, 1872-7336
Online access: Full text
Abstract
• Parallel selected inversion for non-symmetric sparse matrices.
• High performance implementation on distributed memory platforms.
• Experiments demonstrate excellent strong and weak scalability.

This paper generalizes the parallel selected inversion algorithm called PSelInv to sparse non-symmetric matrices. We assume a general sparse matrix $A$ has been decomposed as $PAQ = LU$ on a distributed memory parallel machine, where $L$ and $U$ are lower and upper triangular matrices, respectively, and $P$ and $Q$ are permutation matrices. The PSelInv method computes selected elements of $A^{-1}$. The selection is confined by the sparsity pattern of the matrix $A^T$. Our algorithm does not assume any symmetry properties of $A$, and our parallel implementation is memory efficient, in the sense that the computed elements of $A^{-T}$ overwrite the sparse matrix $L + U$ in situ. PSelInv involves a large number of collective data communication activities within different processor groups of various sizes. In order to minimize idle time and improve load balancing, tree-based asynchronous communication is used to coordinate all such collective communication. Numerical results demonstrate that PSelInv can scale efficiently to 6,400 cores for a variety of matrices.
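The idea of selected inversion described in the abstract can be illustrated with a small serial sketch. This is not the PSelInv implementation: it assumes P = Q = I (no pivoting, so the example matrix is diagonally dominant), uses dense list-of-lists storage, and returns the result in a dictionary Z instead of overwriting L + U in place. The function names lu_nopivot and selected_inverse are hypothetical. The sketch relies on the identities U Z = L^{-1} and Z L = U^{-1} for Z = A^{-1}, and computes only the entries (i, j) for which L + U has a nonzero at (j, i).

```python
def lu_nopivot(A):
    """Doolittle LU without pivoting: A = L U, with L unit lower triangular."""
    n = len(A)
    L = [[float(i == j) for j in range(n)] for i in range(n)]
    U = [[float(x) for x in row] for row in A]
    for k in range(n):
        for i in range(k + 1, n):
            L[i][k] = U[i][k] / U[k][k]
            for j in range(k, n):
                U[i][j] -= L[i][k] * U[k][j]
    return L, U


def selected_inverse(L, U):
    """Entries Z[i, j] of A^{-1} restricted to the pattern of (L + U)^T."""
    n = len(L)
    Z = {}
    for k in range(n - 1, -1, -1):
        # Nonzero structure of column k of L and row k of U, off the diagonal.
        lrows = [i for i in range(k + 1, n) if L[i][k] != 0.0]
        ucols = [j for j in range(k + 1, n) if U[k][j] != 0.0]
        # Upper entries from U Z = L^{-1} (strictly upper part of L^{-1} is 0).
        for j in lrows:
            Z[k, j] = -sum(U[k][m] * Z[m, j] for m in ucols) / U[k][k]
        # Lower entries from Z L = U^{-1} (strictly lower part of U^{-1} is 0).
        for i in ucols:
            Z[i, k] = -sum(Z[i, m] * L[m][k] for m in lrows)
        # Diagonal entry, also from Z L = U^{-1}.
        Z[k, k] = 1.0 / U[k][k] - sum(Z[k, m] * L[m][k] for m in lrows)
    return Z


# Illustrative non-symmetric tridiagonal matrix (values chosen arbitrarily).
A = [[4, 1, 0, 0],
     [2, 5, 1, 0],
     [0, 3, 6, 2],
     [0, 0, 1, 7]]
L, U = lu_nopivot(A)
Z = selected_inverse(L, U)

# The diagonal of A * A^{-1} needs only the selected entries Z[j, i] with
# A[i][j] != 0; each value should equal 1 up to rounding.
n = len(A)
diag_check = [sum(A[i][j] * Z[j, i] for j in range(n) if A[i][j] != 0)
              for i in range(n)]
```

For this tridiagonal example only the tridiagonal entries of the inverse are computed, yet that is enough to evaluate the diagonal of A·A^{-1}, which is the kind of quantity selected inversion is meant to supply without forming the full inverse.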
Authors
1. Jacquelin, Mathias (mjacquelin@lbl.gov), Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
2. Lin, Lin (linlin@math.berkeley.edu), Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
3. Yang, Chao (cyang@lbl.gov), Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
BackLink https://www.osti.gov/servlets/purl/1478750 (view this record in OSTI.GOV)
ContentType Journal Article
Copyright 2017 Elsevier B.V.
CorporateAuthor Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)
DOI 10.1016/j.parco.2017.11.009
Discipline Computer Science
EISSN 1872-7336
EndPage 98
ExternalDocumentID 1478750
10_1016_j_parco_2017_11_009
S0167819117301941
ISSN 0167-8191
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue C
Keywords Parallel algorithm
High performance computation
Non-symmetric
Selected inversion
Language English
Notes AC02-05CH11231; 1450372
National Science Foundation (NSF)
USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC)
USDOE Office of Science (SC), Basic Energy Sciences (BES)
OpenAccessLink https://www.osti.gov/servlets/purl/1478750
PageCount 15
PublicationCentury 2000
PublicationDate 2018-05-01
PublicationDecade 2010
PublicationPlace United States
PublicationTitle Parallel computing
PublicationYear 2018
Publisher Elsevier B.V
Elsevier
StartPage 84
SubjectTerms High performance computation
MATHEMATICS AND COMPUTING
Non-symmetric
Parallel algorithm
Selected inversion
Title PSelInv – A distributed memory parallel algorithm for selected inversion: The non-symmetric case
URI https://dx.doi.org/10.1016/j.parco.2017.11.009
https://www.osti.gov/servlets/purl/1478750
Volume 74