Boundary Conditions for Linear Exit Time Gradient Trajectories Around Saddle Points: Analysis and Algorithm

Gradient-related first-order methods have become the workhorse of large-scale numerical optimization problems. Many of these problems involve nonconvex objective functions with multiple saddle points, which necessitates an understanding of the behavior of discrete trajectories of first-order methods...

Celý popis

Uloženo v:

Podrobná bibliografie
Vydáno v:	IEEE transactions on information theory Ročník 69; číslo 4; s. 2556 - 2602
Hlavní autoři:	Dixit, Rishabh, Gurbuzbalaban, Mert, Bajwa, Waheed U.
Médium:	Journal Article
Jazyk:	angličtina
Vydáno:	New York IEEE 01.04.2023 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:	Algorithms Boundary conditions Convergence Geometry gradient descent linear-time exit Morse function Neighborhoods nonconvex optimization Optimization Perturbation methods saddle escape Saddle points strict-saddle property Sufficient conditions Trajectory Trajectory analysis
ISSN:	0018-9448, 1557-9654
On-line přístup:	Získat plný text
Tagy:	Přidat tag Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!

Popis
Shrnutí:	Gradient-related first-order methods have become the workhorse of large-scale numerical optimization problems. Many of these problems involve nonconvex objective functions with multiple saddle points, which necessitates an understanding of the behavior of discrete trajectories of first-order methods within the geometrical landscape of these functions. This paper concerns convergence of first-order discrete methods to a local minimum of nonconvex optimization problems that comprise strict-saddle points within the geometrical landscape. To this end, it focuses on analysis of discrete gradient trajectories around saddle neighborhoods, derives sufficient conditions under which these trajectories can escape strict-saddle neighborhoods in linear time, explores the contractive and expansive dynamics of these trajectories in neighborhoods of strict-saddle points that are characterized by gradients of moderate magnitude, characterizes the non-curving nature of these trajectories, and highlights the inability of these trajectories to re-enter the neighborhoods around strict-saddle points after exiting them. Based on these insights and analyses, the paper then proposes a simple variant of the vanilla gradient descent algorithm, termed Curvature Conditioned Regularized Gradient Descent (CCRGD) algorithm, which utilizes a check for an initial boundary condition to ensure its trajectories can escape strict-saddle neighborhoods in linear time. Convergence analysis of the CCRGD algorithm, which includes its rate of convergence to a local minimum, is also presented in the paper. Numerical experiments are then provided on a test function as well as a low-rank matrix factorization problem to evaluate the efficacy of the proposed algorithm.
Bibliografie:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	0018-9448 1557-9654
DOI:	10.1109/TIT.2022.3213607