A proximal difference-of-convex algorithm with extrapolation

We consider a class of difference-of-convex (DC) optimization problems whose objective is level-bounded and is the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous concave function. While this kind of problems can be solved by the classical di...

Full description

Saved in:
Bibliographic Details
Published in:Computational optimization and applications Vol. 69; no. 2; pp. 297 - 324
Main Authors: Wen, Bo, Chen, Xiaojun, Pong, Ting Kei
Format: Journal Article
Language:English
Published: New York Springer US 01.03.2018
Springer Nature B.V
Subjects:
ISSN:0926-6003, 1573-2894
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:We consider a class of difference-of-convex (DC) optimization problems whose objective is level-bounded and is the sum of a smooth convex function with Lipschitz gradient, a proper closed convex function and a continuous concave function. While this kind of problems can be solved by the classical difference-of-convex algorithm (DCA) (Pham et al. Acta Math Vietnam 22:289–355, 1997 ), the difficulty of the subproblems of this algorithm depends heavily on the choice of DC decomposition. Simpler subproblems can be obtained by using a specific DC decomposition described in Pham et al. (SIAM J Optim 8:476–505, 1998 ). This decomposition has been proposed in numerous work such as Gotoh et al. (DC formulations and algorithms for sparse optimization problems, 2017 ), and we refer to the resulting DCA as the proximal DCA. Although the subproblems are simpler, the proximal DCA is the same as the proximal gradient algorithm when the concave part of the objective is void, and hence is potentially slow in practice. In this paper, motivated by the extrapolation techniques for accelerating the proximal gradient algorithm in the convex settings, we consider a proximal difference-of-convex algorithm with extrapolation to possibly accelerate the proximal DCA. We show that any cluster point of the sequence generated by our algorithm is a stationary point of the DC optimization problem for a fairly general choice of extrapolation parameters: in particular, the parameters can be chosen as in FISTA with fixed restart (O’Donoghue and Candès in Found Comput Math 15, 715–732, 2015 ). In addition, by assuming the Kurdyka-Łojasiewicz property of the objective and the differentiability of the concave part, we establish global convergence of the sequence generated by our algorithm and analyze its convergence rate. Our numerical experiments on two difference-of-convex regularized least squares models show that our algorithm usually outperforms the proximal DCA and the general iterative shrinkage and thresholding algorithm proposed in Gong et al. (A general iterative shrinkage and thresholding algorithm for non-convex regularized optimization problems, 2013 ).
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0926-6003
1573-2894
DOI:10.1007/s10589-017-9954-1