GPU-Accelerated Adaptive PCBSO Mode-Based Hybrid RLA for Sparse LU Factorization in Circuit Simulation


Bibliographic Details
Published in: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 40, No. 11, pp. 2320-2330
Main authors: Lee, Wai-Kong; Achar, Ramachandra
Format: Journal Article
Language: English
Published: New York, IEEE, 1 November 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0278-0070, 1937-4151
Online access: Full text
Description
Abstract: LU factorization is used extensively in engineering and scientific computing to solve large systems of linear equations. Circuit simulators in particular rely heavily on sparse LU factorization to solve systems involving circuit matrices. One recent advance in this field is the use of graphics processing units (GPUs) for parallel sparse LU factorization. This article makes the following contributions to advance the state of the art in the hybrid right-looking algorithm (RLA): 1) a novel GPU kernel based on parallel column and block size optimization (PCBSO) is developed, which adaptively allocates the block size while optimizing the number of columns executed in parallel, based on the size of their associated submatrices at every level; this approach helps to minimize resource contention and improve computational performance; and 2) an algorithm is developed to enable execution of the new adaptive mode with dynamic parallelism. A comprehensive performance comparison using a set of benchmark circuit examples is also presented. The results indicate that the proposed advancements improve on state-of-the-art right-looking sparse LU factorization on GPUs by 1.54× (arithmetic mean).
DOI: 10.1109/TCAD.2020.3046572
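
For readers unfamiliar with the term "right-looking" used in the abstract, the following is a minimal sketch of the textbook right-looking LU scheme, in which the trailing submatrix to the right of and below the pivot is updated as soon as each column is eliminated. It is an illustration under loud assumptions, not the article's method: it is dense rather than sparse, runs on the CPU rather than a GPU, does no pivoting, and does not implement PCBSO or dynamic parallelism. The function names `right_looking_lu` and `matmul` are hypothetical.

```python
def right_looking_lu(a):
    """Factor a square matrix (list of lists) as A = L * U.

    Assumption: no pivoting, so every pivot must be nonzero.
    """
    n = len(a)
    u = [row[:] for row in a]  # working copy that becomes U
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        pivot = u[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot; this sketch does no pivoting")
        for i in range(k + 1, n):
            # Multiplier for row i, stored in column k of L.
            l[i][k] = u[i][k] / pivot
            u[i][k] = 0.0
            # Right-looking step: update the trailing submatrix immediately.
            # The updates for different (i, j) are independent of each other,
            # which is the kind of work a parallel implementation distributes.
            for j in range(k + 1, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u


def matmul(x, y):
    """Multiply two square matrices given as lists of lists."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    a = [[4.0, 3.0], [6.0, 3.0]]
    l, u = right_looking_lu(a)
    print(l)
    print(u)
    print(matmul(l, u) == a)
```

In a sparse setting, only the nonzero entries of the trailing submatrix are touched, so the amount of update work varies from column to column; the abstract's adaptive block sizing is described as responding to that variation.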