Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems

Sparse linear system solving is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult...

Detailed description

Bibliographic details
Published in: Computers & mathematics with applications (1987), vol. 175, pp. 447-469
Main authors: Wang, Yujie; Wang, Shengquan; Cai, Yong; Wang, Guidong; Li, Guangyao
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.12.2024
ISSN: 0898-1221
Online access: Full text
Abstract Sparse linear system solving is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult for iterative methods to handle their global stiffness matrices. Direct methods, owing to their robustness, emerge as the preferred choice. In this paper, a novel block-based supernodal LDL^T numerical factorization method is introduced. The computation is decomposed into distinct tasks, and the dependencies between these tasks are expressed as a directed acyclic graph that guides the order of execution. Based on this approach, a global task pool and local task stacks are established to store task queues, enhancing data reuse and multicore collaboration efficiency. Additionally, an effective task dispatch and work-stealing mechanism is implemented to prevent performance degradation caused by load imbalance. Numerical experiments, including a publicly available matrix test set and real-world engineering finite element problems, are conducted to compare the parallel performance of PARDISO, MUMPS, and the proposed solver. The results illustrate that the proposed solver performs significantly better than the other solvers when handling various types of sparse matrices and diverse architectures of multicore processors.
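The scheduling idea in the abstract (tasks ordered by a directed acyclic graph, a global task pool, per-worker local stacks, and work stealing) can be illustrated with a small sketch. This is not the authors' implementation: the function name `run_dag`, the task labels, and the stealing policy are assumptions made for illustration, and the workers are simulated in sequential rounds rather than run as real threads.

```python
from collections import deque


def run_dag(deps, n_workers=2):
    """Simulate DAG-driven scheduling (illustrative sketch, hypothetical API).

    `deps` maps each task to the set of tasks it depends on.
    Returns the list of (worker, task) pairs in execution order.
    """
    remaining = {t: len(p) for t, p in deps.items()}  # unmet prerequisites per task
    children = {t: [] for t in deps}                  # reverse edges of the DAG
    for task, prereqs in deps.items():
        for p in prereqs:
            children[p].append(task)

    # Tasks with no prerequisites start in the shared global pool.
    global_pool = deque(sorted(t for t, n in remaining.items() if n == 0))
    local = [[] for _ in range(n_workers)]            # one LIFO stack per worker
    order = []

    while len(order) < len(deps):
        progressed = False
        for w in range(n_workers):
            if local[w]:
                task = local[w].pop()                 # newest local task first (data reuse)
            elif global_pool:
                task = global_pool.popleft()          # otherwise take from the global pool
            else:
                victim = max(range(n_workers), key=lambda v: len(local[v]))
                if not local[victim]:
                    continue                          # nothing to steal in this round
                task = local[victim].pop(0)           # steal the victim's oldest task
            order.append((w, task))
            progressed = True
            for child in sorted(children[task]):
                remaining[child] -= 1
                if remaining[child] == 0:
                    local[w].append(child)            # newly ready work stays with this worker
        if not progressed:
            raise ValueError("dependency cycle: no task is ready")
    return order


# Hypothetical task graph for a 3x3 block factorization step:
# F = factor a diagonal block, S = solve a panel block, U = update a trailing block.
deps = {
    "F0": set(),
    "S10": {"F0"}, "S20": {"F0"},
    "U11": {"S10"}, "U21": {"S10", "S20"}, "U22": {"S20"},
    "F1": {"U11"},
}
for worker, task in run_dag(deps, n_workers=2):
    print(worker, task)
```

Pushing newly ready tasks onto the stack of the worker that produced them keeps dependent work close to data that worker has just touched, while stealing from the opposite end of the fullest stack is one common way to rebalance load with little interference. Whether the paper makes these same choices is not stated in the abstract.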
Author Wang, Shengquan
Wang, Guidong
Wang, Yujie
Cai, Yong
Li, Guangyao
Author_xml – sequence: 1
  givenname: Yujie
  surname: Wang
  fullname: Wang, Yujie
  organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 2
  givenname: Shengquan
  surname: Wang
  fullname: Wang, Shengquan
  organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 3
  givenname: Yong
  surname: Cai
  fullname: Cai, Yong
  email: caiyong@hnu.edu.cn
  organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 4
  givenname: Guidong
  surname: Wang
  fullname: Wang, Guidong
  organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 5
  givenname: Guangyao
  surname: Li
  fullname: Li, Guangyao
  organization: Shenzhen Automotive Research Institute, Beijing Institute of Technology, Shenzhen 518118, Guangdong, China
CitedBy_id crossref_primary_10_3390_app15095130
ContentType Journal Article
Copyright 2024 Elsevier Ltd
Copyright_xml – notice: 2024 Elsevier Ltd
DBID AAYXX
CITATION
DOI 10.1016/j.camwa.2024.10.017
DatabaseName CrossRef
DatabaseTitle CrossRef
DatabaseTitleList
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EndPage 469
ExternalDocumentID 10_1016_j_camwa_2024_10_017
S0898122124004589
ID FETCH-LOGICAL-c303t-9408d99a5298518f0080bf54e30f0b59aa9fb0422a8c0300030da0ea314ff1803
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001344184800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0898-1221
IngestDate Tue Nov 18 21:46:54 EST 2025
Sat Nov 29 05:42:58 EST 2025
Sat Dec 21 15:59:49 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords High performance computing
Sparse direct solver
FEM
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c303t-9408d99a5298518f0080bf54e30f0b59aa9fb0422a8c0300030da0ea314ff1803
PageCount 23
ParticipantIDs crossref_citationtrail_10_1016_j_camwa_2024_10_017
crossref_primary_10_1016_j_camwa_2024_10_017
elsevier_sciencedirect_doi_10_1016_j_camwa_2024_10_017
PublicationCentury 2000
PublicationDate 2024-12-01
2024-12-00
PublicationDateYYYYMMDD 2024-12-01
PublicationDate_xml – month: 12
  year: 2024
  text: 2024-12-01
  day: 01
PublicationDecade 2020
PublicationTitle Computers & mathematics with applications (1987)
PublicationYear 2024
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
  article-title: Total Lagrangian explicit dynamics finite element algorithm for computing soft tissue deformation
  publication-title: Commun. Numer. Methods Eng.
  doi: 10.1002/cnm.887
– start-page: 11
  year: 2020
  ident: 10.1016/j.camwa.2024.10.017_bib0011
  article-title: Scalable direct-iterative hybrid solver for sparse matrices on multi-core and vector architectures
– start-page: 67
  year: 2018
  ident: 10.1016/j.camwa.2024.10.017_bib0032
  article-title: Sparse direct solution on parallel computers
– year: 2019
  ident: 10.1016/j.camwa.2024.10.017_bib0024
  article-title: Parallel algorithms for forward and back substitution in linear algebraic equations of finite element method
  publication-title: J. Telecommun. Inf. Technol.
– volume: 26
  start-page: 265
  year: 2000
  ident: 10.1016/j.camwa.2024.10.017_bib0050
  article-title: Incomplete LU preconditioner for FMM implementation
  publication-title: Microw. Opt. Technol. Lett.
  doi: 10.1002/1098-2760(20000820)26:4<265::AID-MOP18>3.0.CO;2-O
– volume: 225
  start-page: 47
  year: 2018
  ident: 10.1016/j.camwa.2024.10.017_bib0005
  article-title: A parallel finite element procedure for contact-impact problems using edge-based smooth triangular element and GPU
  publication-title: Comput. Phys. Commun.
  doi: 10.1016/j.cpc.2017.12.006
– volume: 59
  start-page: 140
  year: 2016
  ident: 10.1016/j.camwa.2024.10.017_bib0036
  article-title: Accelerating sparse Cholesky factorization on GPUs
  publication-title: Parallel Comput
  doi: 10.1016/j.parco.2016.06.004
– volume: 20
  start-page: 915
  year: 1999
  ident: 10.1016/j.camwa.2024.10.017_bib0019
  article-title: An asynchronous parallel supernodal algorithm for sparse gaussian elimination
  publication-title: SIAM J. Matrix Anal. Appl.
  doi: 10.1137/S0895479897317685
– volume: 25
  start-page: e2183
  year: 2018
  ident: 10.1016/j.camwa.2024.10.017_bib0049
  article-title: Multilevel approaches for FSAI preconditioning
  publication-title: Numer. Linear Algebr. with Appl.
  doi: 10.1002/nla.2183
StartPage 447
SubjectTerms FEM
High performance computing
Sparse direct solver
Title Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems
URI https://dx.doi.org/10.1016/j.camwa.2024.10.017
Volume 175