Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Bibliographic Details
Published in:	SIAM Review, Vol. 53, no. 2, pp. 217-288
Main Authors:	Halko, N., Martinsson, P. G., Tropp, J. A.
Format:	Journal Article
Language:	English
Published:	Philadelphia, PA: Society for Industrial and Applied Mathematics, 01.01.2011
Subjects:	Algorithmics. Computability. Computer arithmetics | Algorithms | Applied sciences | Approximation | Computer science; control theory; systems | Dimensionality reduction | Eigenvalues | Error analysis | Error bounds | Error rates | Exact sciences and technology | Factorization | Linear algebra | Mathematical vectors | Mathematics | Matrices | Matrix | Methods of scientific computing (including symbolic computation, algebraic computation) | Numerical analysis | Numerical analysis. Scientific computation | Numerical linear algebra | Principal components analysis | Randomized algorithms | Sciences and techniques of general use | Studies | SURVEY and REVIEW | Theoretical computing | Matrix approximation | 68W20 | random matrix | Random sampling | Memory | Decomposition method | Krylov subspace method | Eigenvalue | 60B20 | Johnson-Lindenstrauss lemma | Research | Computing | dimension reduction | Multiprocessor | Input | Randomization | 65F30 | Primary | Sparse matrix | Floating point | Robustness | pass-efficient algorithm | Singular value decomposition | Parallel algorithm | Data analysis | Secondary | rank-revealing QR factorization | Architecture | Rank | randomized algorithm | streaming algorithm | interpolative decomposition | Survey | Scientific computation | Applied mathematics | eigenvalue decomposition | Randomness | Environment | M matrix | Principal component analysis
ISSN:	0036-1445, 1095-7200

Description
Summary:	Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multi-processor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
DOI:	10.1137/090771806
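
The two-stage scheme outlined in the Summary (random sampling to find a subspace that captures most of the action of the matrix, compression of the input to that subspace, then a deterministic factorization of the reduced matrix) can be illustrated with a minimal NumPy sketch. This is not the authors' reference implementation. The function name `randomized_svd` and the parameters `oversample`, `power_iters`, and `seed` are assumptions chosen for illustration. The sketch uses a dense Gaussian test matrix, so it does not attain the O(mn log(k)) flop count quoted for dense inputs, which relies on a structured random matrix.

```python
import numpy as np


def randomized_svd(A, k, oversample=10, power_iters=2, seed=0):
    """Approximate the k dominant singular triplets of an m x n matrix A.

    Illustrative sketch only: parameter names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Sample a few more directions than k to improve the subspace estimate.
    ell = min(k + oversample, m, n)

    # Stage 1: random sampling to find an orthonormal basis Q whose range
    # approximately captures the range of A.
    Omega = rng.standard_normal((n, ell))
    Q, _ = np.linalg.qr(A @ Omega)

    # Optional power iterations sharpen the basis when the singular values
    # decay slowly; re-orthonormalizing each step limits round-off growth.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Stage 2: compress A to the subspace and factor the small matrix
    # deterministically.
    B = Q.T @ A  # ell x n reduced matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small  # lift the left singular vectors back to m dimensions

    return U[:, :k], s[:k], Vt[:k, :]
```

Calling `randomized_svd(A, k)` returns factors `U`, `s`, `Vt` such that `U @ np.diag(s) @ Vt` approximates `A`. For a matrix of exact rank `k` the approximation is accurate up to round-off, and otherwise its quality depends on how fast the singular values of `A` decay.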