Sketch Kernel Ridge Regression Using Circulant Matrix: Algorithm and Theory

Bibliographic Details
Published in:IEEE Transactions on Neural Networks and Learning Systems Vol. 31; no. 9; pp. 3512 - 3524
Main Authors: Yin, Rong; Liu, Yong; Wang, Weiping; Meng, Dan
Format: Journal Article
Language:English
Published: Piscataway IEEE 01.09.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:2162-237X, 2162-2388
Description
Summary:Kernel ridge regression (KRR) is a powerful method for nonparametric regression. Computing the KRR estimate directly takes $\mathcal{O}(n^{3})$ time and $\mathcal{O}(n^{2})$ space, where $n$ is the number of data points, which is prohibitive for large-scale data sets. In this article, we propose a novel random sketch technique based on the circulant matrix that saves storage space and accelerates the solution of the KRR approximation. The circulant matrix has the following advantages: its matrix-vector product can be computed with the fast Fourier transform (FFT), which reduces time complexity; its space complexity is linear; and a circulant matrix whose first-column entries are independent and Gaussian distributed is almost as effective as an i.i.d. Gaussian random matrix for approximating KRR. By combining these properties of the circulant matrix with a careful design, the proposed sketch method makes kernel methods scalable and practical for large-scale data problems. Theoretical analysis and experimental results demonstrate that it outperforms state-of-the-art KRR estimates in time complexity while retaining similar accuracy. The sketch method also comes with a theoretical bound that preserves the optimal convergence rate for approximating KRR.
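The FFT-based matrix-vector product mentioned in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `circulant_matvec`, the size `n = 8`, and the random seed are assumptions made for illustration only, and the sketch covers just the generic circulant product, not the paper's full sketching procedure.

```python
import numpy as np

def circulant_matvec(c, x):
    """Return C @ x, where C is the circulant matrix with first column c.

    A circulant product is a circular convolution, so it can be computed
    with FFTs in O(n log n) time while storing only the n entries of c.
    (Hypothetical helper name, used here for illustration.)
    """
    return np.fft.ifft(np.fft.fft(c) * np.fft.fft(x)).real

rng = np.random.default_rng(0)   # arbitrary seed (assumption)
n = 8                            # small size chosen for illustration
c = rng.standard_normal(n)       # independent Gaussian first column
x = rng.standard_normal(n)

# Dense reference, built only to check the result: column j of C is
# c cyclically shifted down by j positions, i.e. C[i, j] = c[(i - j) % n].
C = np.column_stack([np.roll(c, j) for j in range(n)])

fast = circulant_matvec(c, x)    # O(n log n) time, O(n) storage
dense = C @ x                    # O(n^2) time and storage
```

Only the first column `c` has to be kept in memory, which is the linear space cost the summary refers to; the dense matrix `C` is constructed here solely to verify that the two products agree.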
DOI:10.1109/TNNLS.2019.2944959