A fast randomized algorithm for the approximation of matrices

Bibliographic Details
Published in: Applied and computational harmonic analysis Vol. 25, no. 3, pp. 335–366
Main Authors: Woolfe, Franco; Liberty, Edo; Rokhlin, Vladimir; Tygert, Mark
Format: Journal Article
Language: English
Published: Elsevier Inc 01.11.2008
ISSN: 1063-5203, 1096-603X
Description
Summary: We introduce a randomized procedure that, given an m × n matrix A and a positive integer k, approximates A with a matrix Z of rank k. The algorithm relies on applying a structured l × m random matrix R to each column of A, where l is an integer near to, but greater than, k. The structure of R allows us to apply it to an arbitrary m × 1 vector at a cost proportional to m log(l); the resulting procedure can construct a rank-k approximation Z from the entries of A at a cost proportional to mn log(k) + l²(m + n). We prove several bounds on the accuracy of the algorithm; one such bound guarantees that the spectral norm ‖A − Z‖ of the discrepancy between A and Z is of the same order as √(max{m, n}) times the (k + 1)st greatest singular value σ_{k+1} of A, with small probability of large deviations. In contrast, the classical pivoted “QR” decomposition algorithms (such as Gram–Schmidt or Householder) require at least kmn floating-point operations in order to compute a similarly accurate rank-k approximation. In practice, the algorithm of this paper runs faster than the classical algorithms, even when k is quite small or large. Furthermore, the algorithm operates reliably independently of the structure of the matrix A, can access each column of A independently and at most twice, and parallelizes naturally. Thus, the algorithm provides an efficient, reliable means for computing several of the greatest singular values and corresponding singular vectors of A. The results are illustrated via several numerical examples.
DOI: 10.1016/j.acha.2007.12.002
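
Illustration: The summary describes the procedure only in outline, so the following NumPy sketch is one possible reading of it, not the authors' implementation. It assumes that R is a subsampled randomized Fourier transform (a random unit-modulus diagonal scaling, a discrete Fourier transform, and a random selection of l rows); that choice, the function names srft_sketch and randomized_low_rank, and the seed parameter are all assumptions made here. The sketch also applies a full FFT to each column, which costs on the order of m log(m) per column rather than the m log(l) quoted in the summary.

```python
import numpy as np


def srft_sketch(A, l, rng):
    """Apply an assumed l x m structured random matrix R to every column of A."""
    m = A.shape[0]
    # Random diagonal of unit-modulus complex numbers.
    d = np.exp(2j * np.pi * rng.random(m))
    # Unitary discrete Fourier transform of each scaled column.
    F = np.fft.fft(d[:, None] * A, axis=0, norm="ortho")
    # Keep l rows chosen uniformly at random without replacement.
    rows = rng.choice(m, size=l, replace=False)
    return np.sqrt(m / l) * F[rows, :]


def randomized_low_rank(A, k, l, seed=0):
    """Return U, s, V such that Z = U @ diag(s) @ V^H has rank at most k."""
    rng = np.random.default_rng(seed)
    Y = srft_sketch(A, l, rng)                      # l x n sketch of A
    # The leading k right singular vectors of Y approximately span
    # the dominant part of the row space of A.
    _, _, Vh = np.linalg.svd(Y, full_matrices=False)
    Q = Vh[:k].conj().T                             # n x k, orthonormal columns
    B = A @ Q                                       # m x k, second pass over A
    # SVD of the small matrix B yields the factors of Z = B @ Q^H.
    U, s, Wh = np.linalg.svd(B, full_matrices=False)
    V = Q @ Wh.conj().T
    return U, s, V
```

With a real input A the returned factors are complex because of the Fourier transform; the approximation is formed as Z = (U * s) @ V.conj().T, and s estimates the k greatest singular values of A.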