Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems

Sparse linear system solving is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult...

Bibliographic details
Published in:	Computers & mathematics with applications (1987), Vol. 175, pp. 447-469
Main authors:	Wang, Yujie; Wang, Shengquan; Cai, Yong; Wang, Guidong; Li, Guangyao
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd, 1 December 2024
Subjects:	FEM; High performance computing; Sparse direct solver
ISSN:	0898-1221
Online access:	Full text

Abstract	Sparse linear system solving is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult for iterative methods to handle their global stiffness matrices. Direct methods, owing to their robustness, emerge as the preferred choice. In this paper, a novel block-based supernodal LDL^T numerical factorization method is introduced. The computational process is decomposed into distinct tasks, and the dependency relationships between these tasks are expressed via a directed acyclic graph to guide the calculation sequence. Based on this approach, a global task pool and local task stack are established to store task queues, enhancing data reuse and multicore collaboration efficiency. Additionally, an effective task dispatch and work-stealing mechanism is implemented to prevent performance degradation caused by load imbalances. Numerical experiments, including a publicly available matrix test set and real-world engineering finite element problems, are conducted to compare the parallel performance of PARDISO, MUMPS, and the proposed solver. The results show that the proposed solver performs significantly better than the other solvers across various types of sparse matrices and diverse multicore processor architectures.
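The scheduling idea named in the abstract (tasks ordered by a directed acyclic graph, a global task pool, per-worker local stacks, and work stealing) can be illustrated with a small single-threaded simulation. This is only a sketch under stated assumptions, not the authors' implementation: the function name `run_dag`, the `deps` mapping, the task labels, and the stealing policy (take the oldest task from the fullest stack) are all hypothetical choices made for illustration, since the abstract does not specify them. Every task is assumed to appear as a key of `deps`.

```python
from collections import deque


def run_dag(deps, num_workers=2):
    """Simulate DAG-driven scheduling with a global pool and local stacks.

    deps maps each task name to the set of tasks it depends on.
    Returns a list of (worker_id, task) pairs in execution order.
    """
    # Count unmet dependencies and build successor lists.
    remaining = {task: len(parents) for task, parents in deps.items()}
    successors = {task: [] for task in deps}
    for task, parents in deps.items():
        for parent in parents:
            successors[parent].append(task)

    # Tasks without dependencies start in the shared global pool (FIFO).
    global_pool = deque(sorted(t for t, n in remaining.items() if n == 0))
    local_stacks = [[] for _ in range(num_workers)]  # one LIFO stack per worker
    executed = []

    while len(executed) < len(deps):
        progressed = False
        for w in range(num_workers):
            if local_stacks[w]:
                task = local_stacks[w].pop()      # newest local task first
            elif global_pool:
                task = global_pool.popleft()      # fall back to the global pool
            else:
                # Work stealing (assumed policy): oldest task of the fullest stack.
                victim = max(range(num_workers), key=lambda v: len(local_stacks[v]))
                if not local_stacks[victim]:
                    continue                      # nothing runnable this round
                task = local_stacks[victim].pop(0)
            executed.append((w, task))
            progressed = True
            # Successors that just became ready stay with this worker,
            # which is where the data-reuse benefit would come from.
            for succ in successors[task]:
                remaining[succ] -= 1
                if remaining[succ] == 0:
                    local_stacks[w].append(succ)
        if not progressed:
            raise ValueError("dependency cycle detected")
    return executed


if __name__ == "__main__":
    # Hypothetical task graph: two leaf factorizations feed two updates,
    # which both feed the factorization of a parent block.
    example = {
        "factor_1": set(),
        "factor_2": set(),
        "update_1_3": {"factor_1"},
        "update_2_3": {"factor_2"},
        "factor_3": {"update_1_3", "update_2_3"},
    }
    for worker, task in run_dag(example, num_workers=2):
        print(worker, task)
```

A real multicore solver would run the workers as threads and protect the pool and stacks with locks or lock-free queues; the simulation only shows how readiness counts, the shared pool, and stealing interact.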
Author	Wang, Yujie; Wang, Shengquan; Cai, Yong; Wang, Guidong; Li, Guangyao
Author_xml
– sequence: 1; givenname: Yujie; surname: Wang; fullname: Wang, Yujie; organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 2; givenname: Shengquan; surname: Wang; fullname: Wang, Shengquan; organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 3; givenname: Yong; surname: Cai; fullname: Cai, Yong; email: caiyong@hnu.edu.cn; organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 4; givenname: Guidong; surname: Wang; fullname: Wang, Guidong; organization: State Key Laboratory of Advanced Design and Manufacturing Technology for Vehicle, Hunan University, Changsha, 410082, China
– sequence: 5; givenname: Guangyao; surname: Li; fullname: Li, Guangyao; organization: Shenzhen Automotive Research Institute, Beijing Institute of Technology, Shenzhen 518118, Guangdong, China
CitedBy_id	crossref_primary_10_3390_app15095130
ContentType	Journal Article
Copyright	2024 Elsevier Ltd
Copyright_xml	– notice: 2024 Elsevier Ltd
DOI	10.1016/j.camwa.2024.10.017
Discipline	Computer Science
EndPage	469
ExternalDocumentID	10_1016_j_camwa_2024_10_017 S0898122124004589
ISICitedReferencesCount	1
ISSN	0898-1221
IsPeerReviewed	true
IsScholarly	true
Keywords	High performance computing; Sparse direct solver; FEM
Language	English
PageCount	23
PublicationCentury	2000
PublicationDate	2024-12-01 2024-12-00
PublicationDateYYYYMMDD	2024-12-01
PublicationDate_xml	– month: 12 year: 2024 text: 2024-12-01 day: 01
PublicationDecade	2020
PublicationTitle	Computers & mathematics with applications (1987)
PublicationYear	2024
Publisher	Elsevier Ltd
Publisher_xml	– name: Elsevier Ltd
StartPage	447
SubjectTerms	FEM; High performance computing; Sparse direct solver
Title	Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems
URI	https://dx.doi.org/10.1016/j.camwa.2024.10.017
Volume	175