Lower bounds for non-convex stochastic optimization

Published in: Mathematical Programming, Volume 199, Issue 1–2, pp. 165–214
Main authors: Arjevani, Yossi, Carmon, Yair, Duchi, John C., Foster, Dylan J., Srebro, Nathan, Woodworth, Blake
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.05.2023
ISSN: 0025-5610, 1436-4646
Description
Summary: We lower bound the complexity of finding ϵ-stationary points (points with gradient norm at most ϵ) using stochastic first-order methods. In a well-studied model where algorithms access smooth, potentially non-convex functions through queries to an unbiased stochastic gradient oracle with bounded variance, we prove that (in the worst case) any algorithm requires at least ϵ⁻⁴ queries to find an ϵ-stationary point. The lower bound is tight, and establishes that stochastic gradient descent is minimax optimal in this model. In a more restrictive model where the noisy gradient estimates satisfy a mean-squared smoothness property, we prove a lower bound of ϵ⁻³ queries, establishing the optimality of recently proposed variance reduction techniques.
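For illustration, the following is a minimal sketch (not the paper's construction) of the oracle model the summary describes: SGD queries an unbiased, bounded-variance stochastic gradient oracle on a toy smooth non-convex objective and counts queries until an ϵ-stationary point is reached. The objective, noise level, and step size below are illustrative assumptions; the paper's result says that, in the worst case over this model, no first-order method improves on roughly ϵ⁻⁴ queries, a rate SGD attains.

import numpy as np

def true_grad(x):
    # Gradient of the toy smooth non-convex objective
    # f(x) = sum_i x_i^2 / (1 + x_i^2)  (an illustrative choice).
    return 2 * x / (1 + x**2) ** 2

def oracle(x, rng, sigma=0.5):
    # Unbiased stochastic gradient with bounded variance, matching the
    # oracle model described in the summary: E[g] = grad f(x).
    return true_grad(x) + sigma * rng.standard_normal(x.shape)

def sgd_until_stationary(x0, eps=0.1, step=0.005, max_queries=10**6, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float).copy()
    for t in range(1, max_queries + 1):
        # The true gradient is checked only for illustration; a real
        # algorithm only ever sees the noisy oracle responses.
        if np.linalg.norm(true_grad(x)) <= eps:
            return x, t  # eps-stationary point reached after t queries
        x -= step * oracle(x, rng)
    return x, max_queries

x_star, queries = sgd_until_stationary(np.full(3, 2.0), eps=0.1)
print(f"eps-stationary after {queries} oracle queries")

On this easy toy function SGD succeeds quickly; the paper's contribution is a worst-case construction showing that the ϵ⁻⁴ query scaling cannot be beaten in general, and that the ϵ⁻³ rate of variance-reduction methods is likewise optimal under mean-squared smoothness.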
DOI: 10.1007/s10107-022-01822-7