PSelInv – A distributed memory parallel algorithm for selected inversion: The non-symmetric case
• Parallel selected inversion for non-symmetric sparse matrices.
• High performance implementation on distributed memory platforms.
• Experiments demonstrate excellent strong and weak scalability.
This paper generalizes the parallel selected inversion algorithm called PSelInv to sparse non-symmetric matr...
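To make the term "selected inversion" concrete, here is a minimal dense sketch of the quantity described in the abstract: the entries of $A^{-1}$ restricted to the sparsity pattern of $A^T$. It is an illustration only and not the PSelInv algorithm. It forms the full inverse by Gauss-Jordan elimination, which PSelInv avoids, and the function names `inverse` and `selected_inverse` are hypothetical.

```python
from fractions import Fraction


def inverse(a):
    """Dense Gauss-Jordan inverse with row pivoting (reference only)."""
    n = len(a)
    # Build the augmented matrix [A | I] with exact rational arithmetic.
    m = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(a)]
    for col in range(n):
        # Choose the largest pivot in this column and swap it into place.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0:
            raise ValueError("matrix is singular")
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [x / p for x in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and m[r][col] != 0:
                f = m[r][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [row[n:] for row in m]


def selected_inverse(a):
    """Return {(i, j): (A^-1)[i][j]} for every (i, j) where A[j][i] != 0,
    i.e. the entries of the inverse on the sparsity pattern of A^T."""
    inv = inverse(a)
    n = len(a)
    return {(i, j): inv[i][j]
            for i in range(n) for j in range(n)
            if a[j][i] != 0}


# A small non-symmetric example matrix (made up for illustration).
A = [[2, 1, 0],
     [0, 3, 0],
     [4, 0, 5]]
sel = selected_inverse(A)
for (i, j), value in sorted(sel.items()):
    print(f"inv[{i}][{j}] = {value}")
```

Only five of the nine inverse entries are kept here, one per nonzero of `A`. A production code would obtain them from the sparse factors $L$ and $U$ rather than from a dense inverse.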
| Published in: | Parallel computing, Vol. 74, Issue C, pp. 84-98 |
|---|---|
| Main authors: | Jacquelin, Mathias; Lin, Lin; Yang, Chao |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: Elsevier B.V, 01.05.2018 |
| Subjects: | Parallel algorithm; High performance computation; Non-symmetric; Selected inversion |
| ISSN: | 0167-8191, 1872-7336 |
| Online access: | Full text |
| Abstract | • Parallel selected inversion for non-symmetric sparse matrices. • High performance implementation on distributed memory platforms. • Experiments demonstrate excellent strong and weak scalability.
This paper generalizes the parallel selected inversion algorithm PSelInv to sparse non-symmetric matrices. We assume that a general sparse matrix $A$ has been decomposed as $PAQ = LU$ on a distributed memory parallel machine, where $L$ and $U$ are lower and upper triangular matrices, respectively, and $P$ and $Q$ are permutation matrices. The PSelInv method computes selected elements of $A^{-1}$. The selection is confined by the sparsity pattern of the matrix $A^T$. Our algorithm does not assume any symmetry properties of $A$, and our parallel implementation is memory efficient in the sense that the computed elements of $A^{-T}$ overwrite the sparse matrix $L+U$ in situ. PSelInv involves a large number of collective data communication activities within processor groups of various sizes. To minimize idle time and improve load balancing, tree-based asynchronous communication is used to coordinate all such collective communication. Numerical results demonstrate that PSelInv can scale efficiently to 6,400 cores for a variety of matrices. |
|---|---|
| Author | Lin, Lin; Yang, Chao; Jacquelin, Mathias |
| Author_xml | – sequence: 1 givenname: Mathias surname: Jacquelin fullname: Jacquelin, Mathias email: mjacquelin@lbl.gov organization: Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA – sequence: 2 givenname: Lin surname: Lin fullname: Lin, Lin email: linlin@math.berkeley.edu organization: Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA – sequence: 3 givenname: Chao surname: Yang fullname: Yang, Chao email: cyang@lbl.gov organization: Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA |
| BackLink | https://www.osti.gov/servlets/purl/1478750 (View this record in OSTI.GOV) |
| CitedBy_id | crossref_primary_10_1002_cpe_4918 crossref_primary_10_1017_S0962492919000047 crossref_primary_10_1016_j_cpc_2020_107459 crossref_primary_10_3390_e27040384 crossref_primary_10_1137_23M1561531 crossref_primary_10_1063_5_0007045 crossref_primary_10_1080_21681163_2019_1627910 |
| Cites_doi | 10.1137/0915085 10.1137/120902616 10.1145/76909.76910 10.1016/j.jcp.2011.11.032 10.1016/j.jcp.2008.06.033 10.1016/j.jcp.2014.07.004 10.1016/j.jcp.2011.05.027 10.1137/0611010 10.1137/S1064827595287997 10.1016/j.jcp.2009.03.035 10.1137/S0895479899358194 10.1145/1916461.1916464 10.4310/CMS.2009.v7.n3.a12 10.1103/RevModPhys.78.865 10.1145/360680.360704 10.1002/nme.4518 10.1088/0953-8984/25/29/295501 10.1103/PhysRev.140.A1133 10.1137/100799411 10.1145/779359.779361 |
| ContentType | Journal Article |
| Copyright | 2017 Elsevier B.V. |
| Copyright_xml | – notice: 2017 Elsevier B.V. |
| CorporateAuthor | Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC) |
| CorporateAuthor_xml | – name: Lawrence Berkeley National Laboratory (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC) |
| DBID | AAYXX CITATION OIOZB OTOTI |
| DOI | 10.1016/j.parco.2017.11.009 |
| DatabaseName | CrossRef OSTI.GOV - Hybrid OSTI.GOV |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1872-7336 |
| EndPage | 98 |
| ExternalDocumentID | 1478750 10_1016_j_parco_2017_11_009 S0167819117301941 |
| ISSN | 0167-8191 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | C |
| Keywords | Parallel algorithm High performance computation Non-symmetric Selected inversion |
| Language | English |
| LinkModel | OpenURL |
| Notes | AC02-05CH11231; 1450372 National Science Foundation (NSF) USDOE Office of Science (SC), Advanced Scientific Computing Research (ASCR). Scientific Discovery through Advanced Computing (SciDAC) USDOE Office of Science (SC), Basic Energy Sciences (BES) |
| OpenAccessLink | https://www.osti.gov/servlets/purl/1478750 |
| PageCount | 15 |
| ParticipantIDs | osti_scitechconnect_1478750 crossref_primary_10_1016_j_parco_2017_11_009 crossref_citationtrail_10_1016_j_parco_2017_11_009 elsevier_sciencedirect_doi_10_1016_j_parco_2017_11_009 |
| PublicationCentury | 2000 |
| PublicationDate | 2018-05-01 |
| PublicationDateYYYYMMDD | 2018-05-01 |
| PublicationDate_xml | – month: 05 year: 2018 text: 2018-05-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | United States |
| PublicationPlace_xml | – name: United States |
| PublicationTitle | Parallel computing |
| PublicationYear | 2018 |
| Publisher | Elsevier B.V Elsevier |
| Publisher_xml | – name: Elsevier B.V – name: Elsevier |
| SSID | ssj0006480 |
| Score | 2.2505891 |
| SourceID | osti crossref elsevier |
| SourceType | Open Access Repository Enrichment Source Index Database Publisher |
| StartPage | 84 |
| SubjectTerms | High performance computation MATHEMATICS AND COMPUTING Non-symmetric Parallel algorithm Selected inversion |
| Title | PSelInv – A distributed memory parallel algorithm for selected inversion: The non-symmetric case |
| URI | https://dx.doi.org/10.1016/j.parco.2017.11.009 https://www.osti.gov/servlets/purl/1478750 |
| Volume | 74 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | Elsevier |