AmgT: Algebraic Multigrid Solver on Tensor Cores

Bibliographic Details
Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-16
Main Authors: Lu, Yuechen; Zeng, Lijie; Wang, Tengcheng; Fu, Xu; Li, Wenxuan; Cheng, Helin; Yang, Dechuang; Jin, Zhou; Casas, Marc; Liu, Weifeng
Format: Conference Proceeding
Language: English
Published: IEEE 17.11.2024
Description
Summary: Algebraic multigrid (AMG) methods are particularly efficient at solving a wide range of sparse linear systems, owing to their flexibility and adaptability. Even though modern parallel devices, such as GPUs, have brought massive parallelism to AMG, the latest major hardware features, i.e., tensor core units and their low-precision compute power, have not been exploited to accelerate AMG. This paper proposes AmgT, a new AMG solver that utilizes the tensor cores and mixed-precision capability of the latest GPUs during multiple phases of the AMG algorithm. Considering that sparse general matrix-matrix multiplication (SpGEMM) and sparse matrix-vector multiplication (SpMV) are extensively used in the setup and solve phases, respectively, we propose a novel method based on a new unified sparse storage format that leverages tensor cores and their variable precision. Our method both improves the performance of GPU kernels and reduces the cost of format conversion across the whole AMG data flow. To better utilize the algorithm components in existing libraries, the data format and compute kernels of the AmgT solver are incorporated into the HYPRE library. Experimental results on NVIDIA A100, H100 and AMD MI210 GPUs show that AmgT outperforms the original GPU version of HYPRE by geometric-mean factors of 1.46×, 1.32× and 2.24× (up to 2.10×, 2.06× and 3.67×), respectively.
DOI: 10.1109/SC41406.2024.00058
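
The summary does not describe the unified storage format in detail. Purely as an illustration of why a block-based sparse layout suits tensor cores, the sketch below assumes a generic block-sparse-row (BSR) layout with small dense blocks; this is not necessarily AmgT's format. The function names `to_half`, `dense_to_bsr` and `bsr_spmv` are hypothetical, and half-precision storage is emulated on the CPU with Python's `struct` module rather than run on a GPU.

```python
import struct


def to_half(x):
    """Round a float to IEEE half precision and back (emulates FP16 storage)."""
    return struct.unpack("e", struct.pack("e", x))[0]


def dense_to_bsr(dense, b):
    """Convert a dense matrix (list of rows) to BSR form with b x b blocks.

    Returns (row_ptr, col_idx, blocks). Only blocks that contain at least
    one nonzero are stored; all-zero blocks are skipped.
    """
    n_rows, n_cols = len(dense), len(dense[0])
    assert n_rows % b == 0 and n_cols % b == 0, "dimensions must be multiples of b"
    row_ptr, col_idx, blocks = [0], [], []
    for bi in range(n_rows // b):
        for bj in range(n_cols // b):
            block = [[dense[bi * b + r][bj * b + c] for c in range(b)]
                     for r in range(b)]
            if any(v != 0 for row in block for v in row):
                col_idx.append(bj)
                blocks.append(block)
        row_ptr.append(len(col_idx))
    return row_ptr, col_idx, blocks


def bsr_spmv(row_ptr, col_idx, blocks, x, b, low_precision=False):
    """Compute y = A @ x for A in BSR form.

    Each stored block is a small dense matrix-vector product, which is the
    kind of dense tile a tensor core operates on. With low_precision=True the
    operands are rounded to half precision while the accumulator stays in
    higher precision (an assumption made for this sketch).
    """
    y = [0.0] * ((len(row_ptr) - 1) * b)
    for bi in range(len(row_ptr) - 1):
        for k in range(row_ptr[bi], row_ptr[bi + 1]):
            bj = col_idx[k]
            for r in range(b):
                acc = 0.0
                for c in range(b):
                    a, v = blocks[k][r][c], x[bj * b + c]
                    if low_precision:
                        a, v = to_half(a), to_half(v)
                    acc += a * v
                y[bi * b + r] += acc
    return y


# Example: a 4 x 4 block-diagonal matrix stored as two 2 x 2 blocks.
A = [[4, -1, 0, 0],
     [-1, 4, 0, 0],
     [0, 0, 4, -1],
     [0, 0, -1, 4]]
row_ptr, col_idx, blocks = dense_to_bsr(A, 2)
y = bsr_spmv(row_ptr, col_idx, blocks, [1, 2, 3, 4], 2)
```

Because the same block arrays could feed both a matrix-vector and a matrix-matrix kernel, a layout of this kind avoids converting between formats when moving from one phase to the other, which is the general benefit the summary attributes to a unified format.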