Zeroth-order algorithms for nonconvex–strongly-concave minimax problems with improved complexities

Published in: Journal of Global Optimization, Volume 87, Issue 2-4, pp. 709-740
Main authors: Wang, Zhongruo; Balasubramanian, Krishnakumar; Ma, Shiqian; Razaviyayn, Meisam
Format: Journal Article
Language: English
Published: New York: Springer US, 01.11.2023
ISSN: 0925-5001, 1573-2916
Description
Summary: In this paper, we study zeroth-order algorithms for minimax optimization problems that are nonconvex in one variable and strongly-concave in the other variable. Such minimax optimization problems have attracted significant attention lately due to their applications in modern machine learning tasks. We first consider a deterministic version of the problem. We design and analyze the Zeroth-Order Gradient Descent Ascent (ZO-GDA) algorithm, and provide improved results compared to existing works, in terms of oracle complexity. We also propose the Zeroth-Order Gradient Descent Multi-Step Ascent (ZO-GDMSA) algorithm that significantly improves the oracle complexity of ZO-GDA. We then consider stochastic versions of ZO-GDA and ZO-GDMSA, to handle stochastic nonconvex minimax problems. For this case, we provide oracle complexity results under two assumptions on the stochastic gradient: (i) the uniformly bounded variance assumption, which is common in traditional stochastic optimization, and (ii) the Strong Growth Condition (SGC), which has been known to be satisfied by modern over-parameterized machine learning models. We establish that under the SGC assumption, the complexities of the stochastic algorithms match that of deterministic algorithms. Numerical experiments are presented to support our theoretical results.
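To illustrate the idea behind zeroth-order gradient descent ascent, here is a minimal sketch: it replaces gradients with coordinate-wise central finite differences built from function evaluations only, then alternates a descent step on the min variable with an ascent step on the max variable. This is an assumption-laden toy (the paper's ZO-GDA uses particular gradient estimators and step-size choices to obtain its complexity guarantees); the test function, step sizes, and iteration count below are illustrative, not taken from the paper.

```python
import numpy as np

def fd_grad(f, z, eps=1e-5):
    """Coordinate-wise central finite-difference estimate of the gradient
    of f at z, using only function evaluations (the zeroth-order oracle)."""
    g = np.zeros_like(z)
    for i in range(len(z)):
        e = np.zeros_like(z)
        e[i] = eps
        g[i] = (f(z + e) - f(z - e)) / (2 * eps)
    return g

def zo_gda(f, x0, y0, eta_x=0.05, eta_y=0.2, iters=500):
    """Zeroth-order gradient descent ascent on min_x max_y f(x, y):
    descend in x, ascend in y, each step driven by estimated gradients."""
    x, y = np.array(x0, dtype=float), np.array(y0, dtype=float)
    for _ in range(iters):
        gx = fd_grad(lambda x_: f(x_, y), x)  # estimate grad_x f(x, y)
        gy = fd_grad(lambda y_: f(x, y_), y)  # estimate grad_y f(x, y)
        x = x - eta_x * gx  # descent step on the minimization variable
        y = y + eta_y * gy  # ascent step on the maximization variable
    return x, y

# Toy objective (hypothetical example): f(x, y) = |x|^2 + 2 x.y - |y|^2,
# strongly concave in y; the unique saddle point is (0, 0).
f = lambda x, y: float(x @ x + 2 * (x @ y) - y @ y)
x, y = zo_gda(f, x0=[1.0], y0=[1.0])
```

On this toy problem the iterates spiral into the saddle point at the origin; the multi-step variant (ZO-GDMSA) would instead run several ascent steps in y per descent step in x, which is what yields the paper's improved oracle complexity.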
DOI: 10.1007/s10898-022-01160-0