Efficient Kernel Sparse Coding Via First-Order Smooth Optimization

Bibliographic Details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 25, no. 8, pp. 1447-1459
Main Author: Kim, Minyoung
Format: Journal Article
Language: English
Published: New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2014
ISSN: 2162-237X, 2162-2388
Description
Summary: We consider the problem of dictionary learning and sparse coding, where the task is to find a concise set of basis vectors that accurately represent the observed data using only a small number of active bases. Typically formulated as an L1-regularized least-squares problem, it incurs computational difficulty originating from the nondifferentiable objective. Recent approaches to sparse coding have thus mainly focused on accelerating the learning algorithm. In this paper, we propose an even more efficient and scalable sparse coding algorithm based on the first-order smooth optimization technique. The algorithm finds the theoretically guaranteed optimal sparse codes of the epsilon-approximate problem through a series of optimization subproblems, each of which admits an analytic solution; it is hence very fast and scales to large data sets. We further extend it to nonlinear sparse coding using the kernel trick by showing that the representer theorem holds for the kernel sparse coding problem. This allows us to apply dual optimization, which essentially results in the same linear sparse coding problem in the dual variables, a significant benefit over existing methods that suffer from local minima and restricted forms of kernel function. The efficiency of our algorithms is demonstrated on natural stimuli data sets and several image classification problems.
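Illustration: the summary refers to an L1-regularized least-squares objective whose nondifferentiable penalty is replaced by a smooth approximation. The sketch below is not the algorithm from the paper, whose subproblems and analytic updates are not described in this record. It only shows, under assumptions of our own, one common way to smooth the L1 term (a Huber function with parameter `mu`) and minimize the result by plain gradient descent for a fixed dictionary. The names `huber_grad`, `smoothed_sparse_code`, `lam`, `mu`, and `n_iter` are hypothetical.

```python
def huber_grad(t, mu):
    """Derivative of the Huber smoothing of |t| (quadratic for |t| <= mu)."""
    return max(-1.0, min(1.0, t / mu))


def smoothed_sparse_code(D, x, lam=0.1, mu=1e-3, n_iter=5000):
    """Gradient descent on 0.5 * ||x - D a||^2 + lam * sum_j huber_mu(a_j).

    D is a list of rows (n_features x n_atoms); x is a list of n_features floats.
    This is a generic smoothing sketch, not the method of the cited paper.
    """
    n, k = len(D), len(D[0])
    # Upper bound on the Lipschitz constant of the gradient:
    # ||D||_2^2 <= ||D||_F^2, and the Huber term adds at most lam / mu.
    lip = sum(v * v for row in D for v in row) + lam / mu
    a = [0.0] * k
    for _ in range(n_iter):
        # Residual r = D a - x.
        r = [sum(D[i][j] * a[j] for j in range(k)) - x[i] for i in range(n)]
        # Gradient = D^T r + lam * huber'(a).
        g = [
            sum(D[i][j] * r[i] for i in range(n)) + lam * huber_grad(a[j], mu)
            for j in range(k)
        ]
        # Fixed step of size 1 / lip.
        a = [a[j] - g[j] / lip for j in range(k)]
    return a


if __name__ == "__main__":
    D = [[1.0, 0.0], [0.0, 1.0]]  # toy identity dictionary (assumption)
    print(smoothed_sparse_code(D, [1.0, 0.05]))
```

With a smoothed penalty, small coefficients are driven close to zero rather than exactly to zero; a smaller `mu` tightens the approximation at the cost of a smaller step size.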
DOI: 10.1109/TNNLS.2013.2294059