Fully parallel and pipelined sparse direct solver for large symmetric indefinite finite element problems


Bibliographic details
Published in: Computers & mathematics with applications (1987), Vol. 175, pp. 447-469
Main authors: Wang, Yujie; Wang, Shengquan; Cai, Yong; Wang, Guidong; Li, Guangyao
Format: Journal Article
Language: English
Publication details: Elsevier Ltd, 01.12.2024
ISSN:0898-1221
Description
Summary: Sparse linear system solving is a primary computational cost in large-scale finite element analysis, and improving its performance is a key technological challenge in this field. Real-world engineering problems involve diverse materials, elements, and connectivity relationships, making it difficult for iterative methods to handle their global stiffness matrices. Direct methods, owing to their robustness, emerge as the preferred choice. In this paper, a novel block-based supernodal LDL^T numerical factorization method is introduced. The computational process is disassembled into distinct tasks, and the dependency relationships between these tasks are expressed via a directed acyclic graph to guide the calculation sequence. Based on this approach, a global task pool and local task stack are established to store task queues, enhancing data reuse and multicore collaboration efficiency. Additionally, an effective task dispatch and work-stealing mechanism is implemented to prevent performance degradation caused by load imbalances. Numerical experiments, including a publicly available matrix test set and real-world engineering finite element problems, are conducted to compare the parallel performances of Pardiso, MUMPS, and the proposed solver. The results illustrate that the proposed solver performs significantly better than the other solvers when handling various types of sparse matrices and diverse architectures of multicore processors.
DOI:10.1016/j.camwa.2024.10.017
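
Illustrative note: the summary describes tasks ordered by a directed acyclic graph, a global task pool, per-core local task stacks, and work stealing. The record gives no implementation details, so the following is only a minimal sketch of that general scheduling pattern, not the authors' method. The function name `run_dag`, the round-robin single-threaded simulation of workers, and the stealing policy (take the oldest task from the busiest worker) are all assumptions made for illustration.

```python
from collections import deque


def run_dag(tasks, deps, n_workers=2):
    """Run `tasks` (name -> callable) in an order that respects `deps`
    (name -> set of prerequisite names). Workers are simulated in a
    round-robin loop; no real threads are used (an assumption made to
    keep the sketch deterministic). Returns a list of (worker, task)."""
    # Count unfinished prerequisites and build the reverse edges of the DAG.
    remaining = {t: len(deps.get(t, ())) for t in tasks}
    children = {t: [] for t in tasks}
    for t, prerequisites in deps.items():
        for p in prerequisites:
            children[p].append(t)

    # Global pool holds the initially ready tasks; each worker has a LIFO stack.
    global_pool = deque(sorted(t for t, n in remaining.items() if n == 0))
    local = [[] for _ in range(n_workers)]
    order = []

    while len(order) < len(tasks):
        progressed = False
        for w in range(n_workers):
            if local[w]:
                task = local[w].pop()            # newest local task: data likely still in cache
            elif global_pool:
                task = global_pool.popleft()     # otherwise draw from the global pool
            else:
                victim = max(range(n_workers), key=lambda v: len(local[v]))
                if not local[victim]:
                    continue                     # nothing to steal in this round
                task = local[victim].pop(0)      # steal the oldest task of the busiest worker
            tasks[task]()
            order.append((w, task))
            progressed = True
            # Release successors whose prerequisites are now all complete.
            for c in children[task]:
                remaining[c] -= 1
                if remaining[c] == 0:
                    local[w].append(c)
        if not progressed:
            raise ValueError("dependency cycle: no runnable task")
    return order
```

As a usage example with hypothetical task names, two independent factor tasks `F0` and `F1` could each release an update task (`U02`, `U12`), and a final task `F2` would become runnable only after both updates finish; calling `run_dag(tasks, {"U02": {"F0"}, "U12": {"F1"}, "F2": {"U02", "U12"}})` returns an execution order that satisfies those dependencies.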