A control-theoretic perspective on optimal high-order optimization

Bibliographic details
Published in: Mathematical Programming, Vol. 195, No. 1-2, pp. 929-975
Main authors: Lin, Tianyi; Jordan, Michael I.
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 1 September 2022
Springer
Springer Nature B.V
ISSN:0025-5610, 1436-4646
Description
Abstract: We provide a control-theoretic perspective on optimal tensor algorithms for minimizing a convex function in a finite-dimensional Euclidean space. Given a function $\Phi : \mathbb{R}^d \to \mathbb{R}$ that is convex and twice continuously differentiable, we study a closed-loop control system that is governed by the operators $\nabla \Phi$ and $\nabla^2 \Phi$ together with a feedback control law $\lambda(\cdot)$ satisfying the algebraic equation $(\lambda(t))^p \|\nabla \Phi(x(t))\|^{p-1} = \theta$ for some $\theta \in (0, 1)$. Our first contribution is to prove the existence and uniqueness of a local solution to this system via the Banach fixed-point theorem. We present a simple yet nontrivial Lyapunov function that allows us to establish the existence and uniqueness of a global solution under certain regularity conditions and analyze the convergence properties of trajectories. The rate of convergence is $O(1/t^{(3p+1)/2})$ in terms of objective function gap and $O(1/t^{3p})$ in terms of squared gradient norm. Our second contribution is to provide two algorithmic frameworks obtained from discretization of our continuous-time system, one of which generalizes the large-step A-HPE framework of Monteiro and Svaiter (SIAM J Optim 23(2):1092–1125, 2013) and the other of which leads to a new optimal $p$-th order tensor algorithm. While our discrete-time analysis can be seen as a simplification and generalization of Monteiro and Svaiter (2013), it is largely motivated by the aforementioned continuous-time analysis, demonstrating the fundamental role that the feedback control plays in optimal acceleration and the clear advantage that the continuous-time perspective brings to algorithmic design. A highlight of our analysis is that we show that all of the $p$-th order optimal tensor algorithms that we discuss minimize the squared gradient norm at a rate of $O(k^{-3p})$, which complements the recent analysis in Gasnikov et al. (in: COLT, PMLR, pp 1374–1391, 2019), Jiang et al. (in: COLT, PMLR, pp 1799–1801, 2019) and Bubeck et al. (in: COLT, PMLR, pp 492–507, 2019).
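As a rough illustration of the feedback law in the abstract, the algebraic equation $(\lambda(t))^p \|\nabla \Phi(x(t))\|^{p-1} = \theta$ can be solved for $\lambda$ in closed form whenever the gradient is nonzero. The sketch below is not taken from the article: the function names `feedback_lambda` and `rate_exponents` are hypothetical, and it assumes $p$ is a positive integer and that only the gradient norm at the current point is needed.

```python
def feedback_lambda(grad_norm: float, theta: float, p: int) -> float:
    """Solve lam**p * grad_norm**(p - 1) == theta for lam > 0.

    Hypothetical helper: the name and argument checks are illustrative
    assumptions, only the algebraic equation comes from the abstract.
    """
    if not 0.0 < theta < 1.0:
        raise ValueError("theta must lie in the open interval (0, 1)")
    if p < 1:
        raise ValueError("p is assumed to be a positive integer")
    if grad_norm <= 0.0:
        raise ValueError("the closed form needs a nonzero gradient norm")
    # Rearranging the equation gives lam = (theta / grad_norm**(p-1))**(1/p).
    return (theta / grad_norm ** (p - 1)) ** (1.0 / p)


def rate_exponents(p: int) -> tuple[float, int]:
    """Exponents a, b in the stated continuous-time rates O(1/t**a), O(1/t**b).

    a = (3p + 1) / 2 is for the objective gap, b = 3p for the squared
    gradient norm, as quoted in the abstract.
    """
    return (3 * p + 1) / 2, 3 * p


if __name__ == "__main__":
    lam = feedback_lambda(grad_norm=2.0, theta=0.5, p=2)
    print(lam, rate_exponents(2))
```

For $p = 2$ the stated rates become $O(1/t^{7/2})$ for the objective gap and $O(1/t^{6})$ for the squared gradient norm. Note that the closed form makes $\lambda$ grow as the gradient norm shrinks when $p > 1$.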
DOI:10.1007/s10107-021-01721-3