First-Order Penalty Methods for Bilevel Optimization

Bibliographic Details
Title: First-Order Penalty Methods for Bilevel Optimization
Authors: Zhaosong Lu, Sanyou Mei
Source: SIAM Journal on Optimization. 34:1937-1969
Publication Status: Preprint
Publisher Information: Society for Industrial & Applied Mathematics (SIAM), 2024.
Publication Year: 2024
Keywords: bilevel optimization, first-order methods, FOS: Computer and information sciences, Computer Science - Machine Learning, 0211 other engineering and technologies, operation complexity, Machine Learning (stat.ML), 02 engineering and technology, Numerical Analysis (math.NA), Nonconvex programming, global optimization, Minimax problems in mathematical programming, Machine Learning (cs.LG), penalty methods, Numerical mathematical programming methods, Nonlinear programming, Statistics - Machine Learning, Optimization and Control (math.OC), FOS: Mathematics, minimax optimization, Mathematics - Numerical Analysis, 90C26, 90C30, 90C47, 90C99, 65K05, Mathematics - Optimization and Control
Description: In this paper we study a class of unconstrained and constrained bilevel optimization problems in which the lower level is a possibly nonsmooth convex optimization problem, while the upper level is a possibly nonconvex optimization problem. We introduce a notion of $\varepsilon$-KKT solution for them and show that an $\varepsilon$-KKT solution leads to an $O(\sqrt{\varepsilon})$- or $O(\varepsilon)$-hypergradient-based stationary point under suitable assumptions. We also propose first-order penalty methods for finding an $\varepsilon$-KKT solution of them, whose subproblems turn out to be structured minimax problems and can be suitably solved by a first-order method recently developed by the authors. Under suitable assumptions, an \emph{operation complexity} of $O(\varepsilon^{-4}\log\varepsilon^{-1})$ and $O(\varepsilon^{-7}\log\varepsilon^{-1})$, measured by their fundamental operations, is established for the proposed penalty methods for finding an $\varepsilon$-KKT solution of the unconstrained and constrained bilevel optimization problems, respectively. Preliminary numerical results are presented to illustrate the performance of our proposed methods. To the best of our knowledge, this paper is the first work to demonstrate that bilevel optimization can be approximately solved as minimax optimization, and moreover, it provides the first implementable method with complexity guarantees for such sophisticated bilevel optimization.
Accepted by SIAM Journal on Optimization
Publication Type: Article
File Description: application/xml
Language: English
ISSN: 1095-7189
1052-6234
DOI: 10.1137/23m1566753
DOI: 10.48550/arxiv.2301.01716
Access URL: http://arxiv.org/abs/2301.01716
Rights: arXiv Non-Exclusive Distribution
Document Code: edsair.doi.dedup.....b25023fbb7d51a13ea0e567f16d3366d
Database: OpenAIRE
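
Illustrative note: The description states that the penalty subproblems are structured minimax problems. One way to read this, for an unconstrained problem $\min_{x,y} f(x,y)$ subject to $y \in \arg\min_z g(x,z)$, is that the penalized objective $f(x,y) + \rho\,(g(x,y) - \min_z g(x,z))$ equals $\min_{x,y}\max_z\, f(x,y) + \rho\,(g(x,y) - g(x,z))$. The sketch below applies this reading to a toy instance. Everything in it is an assumption made for illustration and is not taken from the paper: the functions $f(x,y) = (x-1)^2 + (y+1)^2$ and $g(x,y) = \tfrac12 (y-x)^2$, the name `solve_penalty_minimax`, the step sizes, and the use of plain alternating gradient descent-ascent in place of the authors' first-order minimax method. For this toy instance the bilevel solution is $(0, 0)$ and the penalized minimizer works out to $x = 1/(1+\rho)$, $y = -1/(1+\rho)$.

```python
def f_grad(x, y):
    """Gradient of the toy upper-level objective f(x, y) = (x - 1)^2 + (y + 1)^2."""
    return 2.0 * (x - 1.0), 2.0 * (y + 1.0)


def g(x, y):
    """Toy lower-level objective g(x, y) = 0.5 * (y - x)^2, minimized over y at y = x."""
    return 0.5 * (y - x) ** 2


def solve_penalty_minimax(rho, outer_iters=5000, inner_iters=50):
    """Approximately solve min_{x,y} max_z f(x,y) + rho * (g(x,y) - g(x,z)).

    Hypothetical illustration: alternating gradient ascent in z and
    gradient descent in (x, y), not the method analyzed in the paper.
    """
    x, y, z = 0.5, 0.5, 0.5
    step_xy = 1.0 / (2.0 + 2.0 * rho)  # 1/L for the penalized objective of this toy
    step_z = 0.5 / rho                 # halves the gap z - x on every ascent step
    for _ in range(outer_iters):
        # Inner maximization over z: ascend on -rho * g(x, z).
        for _ in range(inner_iters):
            z += step_z * (-rho * (z - x))
        # Outer descent on (x, y) with z held fixed.
        fx, fy = f_grad(x, y)
        grad_x = fx + rho * (z - y)  # d/dx of rho * (g(x, y) - g(x, z)) is rho * (z - y)
        grad_y = fy + rho * (y - x)  # d/dy of rho * g(x, y) is rho * (y - x)
        x -= step_xy * grad_x
        y -= step_xy * grad_y
    return x, y, z


if __name__ == "__main__":
    for rho in (1.0, 10.0, 100.0):
        x, y, z = solve_penalty_minimax(rho)
        # g(x, y) measures how far y is from lower-level optimality.
        print(f"rho={rho:6.1f}  x={x:.6f}  y={y:.6f}  lower-level gap={g(x, y):.2e}")
```

Running the loop for increasing `rho` shows the iterates approaching the bilevel solution while the lower-level gap shrinks, which is the trade-off a penalty parameter is meant to control.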