A control-theoretic perspective on optimal high-order optimization
We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $\Phi : \mathbb{R}^d \to \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators...
| Published in: | Mathematical Programming, Vol. 195, No. 1-2, pp. 929-975 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg, 1 September 2022; Springer; Springer Nature B.V |
| Subjects: | |
| ISSN: | 0025-5610, 1436-4646 |
| Online access: | Full text |
| Abstract: | We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $\Phi : \mathbb{R}^d \to \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators $\nabla \Phi$ and $\nabla^2 \Phi$ together with a feedback control law $\lambda(\cdot)$ satisfying the algebraic equation $(\lambda(t))^p \lVert \nabla \Phi(x(t)) \rVert^{p-1} = \theta$ for some $\theta \in (0, 1)$. Our first contribution is to prove the existence and uniqueness of a local solution to this system via the Banach fixed-point theorem. We present a simple yet nontrivial Lyapunov function that allows us to establish the existence and uniqueness of a global solution under certain regularity conditions and analyze the convergence properties of trajectories. The rate of convergence is $O(1/t^{(3p+1)/2})$ in terms of objective function gap and $O(1/t^{3p})$ in terms of squared gradient norm. Our second contribution is to provide two algorithmic frameworks obtained from discretization of our continuous-time system, one of which generalizes the large-step A-HPE framework of Monteiro and Svaiter (SIAM J Optim 23(2):1092–1125, 2013) and the other of which leads to a new optimal $p$-th order tensor algorithm. While our discrete-time analysis can be seen as a simplification and generalization of Monteiro and Svaiter (2013), it is largely motivated by the aforementioned continuous-time analysis, demonstrating the fundamental role that the feedback control plays in optimal acceleration and the clear advantage that the continuous-time perspective brings to algorithmic design. A highlight of our analysis is that we show that all of the $p$-th order optimal tensor algorithms that we discuss minimize the squared gradient norm at a rate of $O(k^{-3p})$, which complements the recent analysis in Gasnikov et al. (in: COLT, PMLR, pp 1374–1391, 2019), Jiang et al. (in: COLT, PMLR, pp 1799–1801, 2019) and Bubeck et al. (in: COLT, PMLR, pp 492–507, 2019). |
| DOI: | 10.1007/s10107-021-01721-3 |
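
The algebraic equation in the abstract, $(\lambda(t))^p \lVert \nabla \Phi(x(t)) \rVert^{p-1} = \theta$, can be solved for the feedback control in closed form whenever the gradient is nonzero: $\lambda = (\theta / \lVert \nabla \Phi(x) \rVert^{p-1})^{1/p}$. The following is a minimal illustrative sketch of that one step only, not the paper's algorithm or its discretization. The function name `feedback_lambda` and its argument layout are assumptions introduced here for illustration.

```python
import math


def feedback_lambda(grad, theta, p):
    """Return lambda > 0 with lambda**p * ||grad||**(p - 1) == theta.

    Hypothetical helper (not from the paper): `grad` is the gradient of
    Phi at the current point, given as a sequence of floats.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in the open interval (0, 1)")
    if p < 1:
        raise ValueError("p must be a positive integer")
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0 and p > 1:
        # The equation has no finite solution at a stationary point.
        raise ValueError("gradient is zero; lambda is undefined for p > 1")
    # Closed-form solution of the algebraic equation for lambda.
    return (theta / norm ** (p - 1)) ** (1.0 / p)


# Usage: a gradient of norm 5 with theta = 0.5 and p = 2.
lam = feedback_lambda([3.0, 4.0], theta=0.5, p=2)
residual = lam ** 2 * 5.0 - 0.5
```

The closed form shows that the control grows as the gradient norm shrinks, so for $p > 1$ the value of $\lambda$ increases along a trajectory on which the gradient tends to zero.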