AmgT: Algebraic Multigrid Solver on Tensor Cores

Algebraic multigrid (AMG) methods are particularly efficient to solve a wide range of sparse linear systems, due to their good flexibility and adaptability. Even though modern parallel devices, such as GPUs, brought massive parallelism to AMG, the latest major hardware features, i.e., tensor core un...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:SC24: International Conference for High Performance Computing, Networking, Storage and Analysis s. 1 - 16
Hlavní autori: Lu, Yuechen, Zeng, Lijie, Wang, Tengcheng, Fu, Xu, Li, Wenxuan, Cheng, Helin, Yang, Dechuang, Jin, Zhou, Casas, Marc, Liu, Weifeng
Médium: Konferenčný príspevok..
Jazyk:English
Vydavateľské údaje: IEEE 17.11.2024
Predmet:
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:Algebraic multigrid (AMG) methods are particularly efficient to solve a wide range of sparse linear systems, due to their good flexibility and adaptability. Even though modern parallel devices, such as GPUs, brought massive parallelism to AMG, the latest major hardware features, i.e., tensor core units and their low precision compute power, have not been exploited to accelerate AMG. This paper proposes AmgT, a new AMG solver that utilizes the tensor core and mixed precision ability of the latest GPUs during multiple phases of the AMG algorithm. Considering that the sparse general matrix-matrix multiplication (SpGEMM) and sparse matrix-vector multiplication (SpMV) are extensively used in the setup and solve phases, respectively, we propose a novel method based on a new unified sparse storage format that leverages tensor cores and their variable precision. Our method improves both the performance of GPU kernels, and also reduces the cost of format conversion in the whole data flow of AMG. To better utilize the algorithm components in existing libraries, the data format and compute kernels of the AmgT solver are incorporated into the HYPRE library. The experimental results on NVIDIA A100, H100 and AMD MI210 GPUs show that our AmgT outperforms the original GPU version of HYPRE by a factor of on geomean 1.46 \times, 1.32 \times and 2.24 \times (up to 2.10 \times, 2.06 \times and 3.67 \times), respectively.
DOI:10.1109/SC41406.2024.00058