Stochastic variance reduced gradient with hyper-gradient for non-convex large-scale learning

Detailed bibliography
Published in: Applied Intelligence (Dordrecht, Netherlands), Volume 53, Issue 23, pp. 28627-28641
Main author: Yang, Zhuang
Format: Journal Article
Language: English
Published: New York: Springer US, 01.12.2023 (Springer Nature B.V.)
ISSN: 0924-669X, 1573-7497
Description
Summary: Non-convex optimization, which can better capture the problem structure, has received considerable attention in applications such as machine learning, image/signal processing, and statistics. Owing to their faster convergence rates, stochastic variance reduced algorithms for solving these non-convex optimization problems have been studied extensively. However, although the step size is a crucial hyper-parameter of stochastic variance reduced algorithms, how to select an appropriate step size for non-convex optimization problems has received little attention. To address this gap, we propose a new class of stochastic variance reduced algorithms based on the hyper-gradient, which can automatically obtain the step size online. Specifically, we focus on the stochastic variance reduced gradient (SVRG) algorithm, a variance-reduced stochastic optimization algorithm that computes a full gradient periodically. We theoretically analyze the convergence of the proposed algorithm for non-convex optimization problems. Moreover, we show that the proposed algorithm enjoys the same complexity as state-of-the-art algorithms for solving non-convex problems in terms of finding an approximate stationary point. Thorough numerical results on empirical risk minimization with non-convex loss functions validate the efficacy of our method.
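
To make the idea in the summary concrete, the sketch below combines the standard SVRG update (a full gradient at a periodic snapshot point plus variance-corrected stochastic steps) with a hypergradient step-size rule in the style of Baydin et al.'s hypergradient descent. This is an illustrative sketch under assumptions, not the paper's algorithm: the names (svrg_hypergradient, grad_i), the specific update eta += beta * <v_t, v_{t-1}>, and the toy data are hypothetical, and the paper's actual online step-size rule and safeguards may differ.

    import numpy as np

    def svrg_hypergradient(grad_i, w0, n, epochs=20, inner=100,
                           eta0=0.1, beta=1e-4, seed=0):
        # grad_i(w, i): gradient of the i-th component function at w.
        # n: number of component functions in the finite sum.
        # eta0: initial step size; beta: hypergradient learning rate.
        rng = np.random.default_rng(seed)
        w_snap = np.asarray(w0, dtype=float)
        eta = eta0
        for _ in range(epochs):
            # Full gradient at the snapshot, recomputed once per epoch
            # (the periodic full-gradient computation of SVRG).
            mu = np.mean([grad_i(w_snap, i) for i in range(n)], axis=0)
            w = w_snap.copy()
            v_prev = None
            for _ in range(inner):
                i = rng.integers(n)
                # Variance-reduced gradient estimate: unbiased, with
                # variance shrinking as w approaches the snapshot.
                v = grad_i(w, i) - grad_i(w_snap, i) + mu
                if v_prev is not None:
                    # Hypothetical hypergradient update: grow eta when
                    # consecutive directions agree, shrink when they
                    # oppose; the floor keeps the step size positive.
                    eta = max(eta + beta * float(np.dot(v, v_prev)), 1e-8)
                w = w - eta * v
                v_prev = v
            w_snap = w
        return w_snap

    # Minimal usage on a toy least-squares problem (hypothetical data);
    # the loop itself is unchanged for non-convex component losses.
    rng = np.random.default_rng(1)
    A, b = rng.normal(size=(50, 3)), rng.normal(size=50)
    w_star = svrg_hypergradient(lambda w, i: (A[i] @ w - b[i]) * A[i],
                                w0=np.zeros(3), n=50)

The dot-product rule is the stochastic hypergradient of the loss with respect to eta up to sign: since w_t = w_{t-1} - eta * v_{t-1}, descending on eta yields the additive correction beta * <v_t, v_{t-1}>, which is what lets the step size adapt online without manual tuning.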
DOI: 10.1007/s10489-023-05025-1