Efficient algorithms for computing rank‐revealing factorizations on a GPU


Bibliographic Details
Published in: Numerical linear algebra with applications, Vol. 30, no. 6
Main Authors: Heavner, Nathan; Chen, Chao; Gopal, Abinand; Martinsson, Per‐Gunnar
Format: Journal Article
Language: English
Published: Oxford Wiley Subscription Services, Inc 01.12.2023
ISSN: 1070-5325, 1099-1506
Description
Summary: Standard rank‐revealing factorizations such as the singular value decomposition (SVD) and column pivoted QR factorization are challenging to implement efficiently on a GPU. A major difficulty in this regard is the inability of standard algorithms to cast most operations in terms of the Level‐3 BLAS. This article presents two alternative algorithms for computing a rank‐revealing factorization of the form $A = U T V^{*}$, where $U$ and $V$ are orthogonal and $T$ is trapezoidal (or triangular if $A$ is square). Both algorithms use randomized projection techniques to cast most of the flops in terms of matrix‐matrix multiplication, which is exceptionally efficient on the GPU. Numerical experiments illustrate that these algorithms achieve significant acceleration over finely tuned GPU implementations of the SVD while providing low rank approximation errors close to those of the SVD.
DOI: 10.1002/nla.2515
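
Illustration: the summary describes factorizations of the form A = U T V* that are built mostly from matrix-matrix products by way of randomized projections. The following is a minimal sketch of one well-known randomized scheme of this general type, and it is not claimed to be either of the two algorithms in the article. It assumes a real matrix (so V* is V transposed) and runs on the CPU with NumPy rather than on a GPU. The function name `power_urv` and the parameters `q` and `seed` are hypothetical, chosen only for this example.

```python
import numpy as np


def power_urv(A, q=2, seed=0):
    """Illustrative randomized factorization A = U @ T @ V.T for a real matrix A.

    Returns U (m x m, orthogonal), T (m x n, upper trapezoidal) and
    V (n x n, orthogonal). q is the number of power-iteration steps.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape

    # Random projection: start from a Gaussian test matrix.
    Y = rng.standard_normal((n, n))

    # Power iteration Y <- (A^T A) Y, dominated by matrix-matrix products.
    # Re-orthonormalizing before each step limits loss of accuracy.
    for _ in range(q):
        Y, _ = np.linalg.qr(Y)
        Y = A.T @ (A @ Y)

    # Orthonormal basis whose leading columns align with the
    # dominant right singular directions of A.
    V, _ = np.linalg.qr(Y)

    # Unpivoted QR of A V gives A V = U T, hence A = U T V^T.
    U, T = np.linalg.qr(A @ V, mode="complete")
    return U, T, V
```

A rank-k approximation can then be formed by truncation, for example `U[:, :k] @ T[:k, :] @ V.T`. How close its error comes to that of the truncated SVD depends on the singular value decay of A and on the choice of `q`.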