Bibliographic Details
| Title: |
A Comparative Study of Block Incomplete Sparse Approximate Inverses Preconditioning on Tesla K20 and V100 GPUs. |
| Authors: |
Ma, Wenpeng, Yuan, Wu, Liu, Xiazhen |
| Source: |
Algorithms; Jul2021, Vol. 14 Issue 7, p204, 1p |
| Subject Terms: |
COMPARATIVE studies, GRAPHICS processing units, ALGORITHMS |
| Company/Entity: |
NVIDIA Corp. |
| Abstract: |
Incomplete Sparse Approximate Inverses (ISAI) has shown some advantages over sparse triangular solves on GPUs when it is used for the incomplete LU based preconditioner. In this paper, we extend the single GPU method for Block–ISAI to multiple GPUs algorithm by coupling Block–Jacobi preconditioner, and introduce the detailed implementation in the open source numerical package PETSc. In the experiments, two representative cases are performed and a comparative study of Block–ISAI on up to four GPUs are conducted on two major generations of NVIDIA's GPUs (Tesla K20 and Tesla V100). Block–Jacobi preconditioning with Block–ISAI (BJPB-ISAI) shows an advantage over the level-scheduling based triangular solves from the cuSPARSE library for the cases, and the overhead of setting up Block–ISAI and the total wall clock times of GMRES is greatly reduced using Tesla V100 GPUs compared to Tesla K20 GPUs. [ABSTRACT FROM AUTHOR] |
|
Copyright of Algorithms is the property of MDPI and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
| Database: |
Complementary Index |