Efficient Kernel Sparse Coding Via First-Order Smooth Optimization
| Published in: | IEEE Transactions on Neural Networks and Learning Systems, Vol. 25, no. 8, pp. 1447-1459 |
|---|---|
| Main Author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, NY: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.08.2014 |
| Subjects: | |
| ISSN: | 2162-237X, 2162-2388 |
| Summary: | We consider the problem of dictionary learning and sparse coding, where the task is to find a concise set of basis vectors that accurately represent the observation data with only a small number of active bases. Typically formulated as an L1-regularized least-squares problem, the problem incurs computational difficulty originating from the nondifferentiable objective. Recent approaches to sparse coding have thus mainly focused on accelerating the learning algorithm. In this paper, we propose an even more efficient and scalable sparse coding algorithm based on the first-order smooth optimization technique. The algorithm finds the theoretically guaranteed optimal sparse codes of the epsilon-approximate problem in a series of optimization subproblems, where each subproblem admits an analytic solution and is hence very fast and scalable to large-scale data. We further extend it to nonlinear sparse coding using the kernel trick by showing that the representer theorem holds for the kernel sparse coding problem. This allows us to apply dual optimization, which essentially results in the same linear sparse coding problem in dual variables. This is highly beneficial compared with existing methods, which suffer from local minima and restricted forms of kernel function. The efficiency of our algorithms is demonstrated on natural stimuli data sets and several image classification problems. |
| DOI: | 10.1109/TNNLS.2013.2294059 |
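
For readers unfamiliar with the L1-regularized least-squares formulation mentioned in the summary, the following is a minimal sketch of computing a sparse code for one observation with a fixed dictionary. It uses ISTA (iterative soft-thresholding), a standard baseline, and is not the first-order smooth optimization algorithm proposed in the article. The names `soft_threshold`, `sparse_code_ista`, `D`, `x`, `lam`, and `step` are illustrative assumptions and do not come from the article.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def sparse_code_ista(D, x, lam, step, n_iter=500):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.

    D is a list of rows (len(x) rows, k columns). For convergence, step
    should be at most 1 / L, where L is the largest eigenvalue of D^T D.
    This is plain ISTA, an assumed stand-in, not the article's method.
    """
    n, k = len(D), len(D[0])
    a = [0.0] * k
    for _ in range(n_iter):
        # Residual of the current reconstruction: r = D a - x
        r = [sum(D[i][j] * a[j] for j in range(k)) - x[i] for i in range(n)]
        # Gradient of the smooth least-squares term: g = D^T r
        g = [sum(D[i][j] * r[i] for i in range(n)) for j in range(k)]
        # Gradient step followed by soft-thresholding (handles the L1 term)
        a = [soft_threshold(a[j] - step * g[j], step * lam) for j in range(k)]
    return a


# Toy example with an identity dictionary, where the exact solution is
# the soft-thresholded observation.
D = [[1.0, 0.0], [0.0, 1.0]]
x = [3.0, 0.5]
print(sparse_code_ista(D, x, lam=1.0, step=1.0))  # [2.0, 0.0]
```

The soft-thresholding step is what produces exact zeros in the code. The nondifferentiability it works around is the same difficulty the summary attributes to the L1 objective.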