No dimension-free deterministic algorithm computes approximate stationarities of Lipschitzians


Detailed description

Bibliographic details
Published in: Mathematical Programming, Vol. 208, No. 1-2, pp. 51-74
Main authors: Tian, Lai; So, Anthony Man-Cho
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 1 November 2024
Springer
ISSN: 0025-5610, 1436-4646
Online access: Full text
Abstract We consider the oracle complexity of computing an approximate stationary point of a Lipschitz function. When the function is smooth, it is well known that the simple deterministic gradient method has finite dimension-free oracle complexity. However, when the function can be nonsmooth, it is only recently that a randomized algorithm with finite dimension-free oracle complexity has been developed. In this paper, we show that no deterministic algorithm can do the same. Moreover, even without the dimension-free requirement, we show that any finite-time deterministic method cannot be general zero-respecting. In particular, this implies that a natural derandomization of the aforementioned randomized algorithm cannot have finite-time complexity. Our results reveal a fundamental hurdle in modern large-scale nonconvex nonsmooth optimization.
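To illustrate the smooth case mentioned in the abstract, here is a minimal sketch (not taken from the paper) of the plain gradient method with step size 1/L. For an L-smooth function that is bounded below, the standard descent lemma gives an iteration bound of 2L(f(x0) - inf f)/eps^2 for reaching a point with gradient norm at most eps, and that bound does not involve the dimension. The test function sqrt(1 + ||x||^2), the starting point, and all function names are assumptions chosen for illustration only.

```python
import math

def gradient_descent(grad, x0, smoothness, eps, max_iters=100000):
    """Iterate x <- x - (1/L) * grad(x) until ||grad(x)|| <= eps.

    Returns the final point and the number of gradient-oracle calls.
    """
    x = list(x0)
    for calls in range(1, max_iters + 1):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) <= eps:
            return x, calls
        x = [xi - gi / smoothness for xi, gi in zip(x, g)]
    raise RuntimeError("no eps-stationary point found within max_iters")

def iteration_bound(smoothness, initial_gap, eps):
    """Descent-lemma bound on gradient steps: 2 * L * (f(x0) - inf f) / eps^2."""
    return math.ceil(2.0 * smoothness * initial_gap / eps ** 2)

# Illustrative 1-smooth function (an assumption, not from the paper):
# f(x) = sqrt(1 + ||x||^2), whose minimum value is 1 at the origin.
def f(x):
    return math.sqrt(1.0 + sum(xi * xi for xi in x))

def grad_f(x):
    scale = f(x)
    return [xi / scale for xi in x]

def run(dim, radius=3.0, eps=0.1):
    # Start at distance `radius` from the origin in every dimension, so the
    # initial gap f(x0) - 1 is the same regardless of `dim`.
    x0 = [radius / math.sqrt(dim)] * dim
    x, calls = gradient_descent(grad_f, x0, smoothness=1.0, eps=eps)
    bound = iteration_bound(1.0, f(x0) - 1.0, eps)
    return x, calls, bound

if __name__ == "__main__":
    for dim in (2, 2000):
        _, calls, bound = run(dim)
        print(f"dim={dim}: {calls} oracle calls, iteration bound {bound}")
```

The oracle-call count is at most the iteration bound plus one (the final call only certifies stationarity). The abstract's main result is that no deterministic method can offer a comparable dimension-free guarantee once the function is allowed to be nonsmooth.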
Audience Academic
Author So, Anthony Man-Cho
Tian, Lai
Author_xml – sequence: 1
  givenname: Lai
  surname: Tian
  fullname: Tian, Lai
  organization: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
– sequence: 2
  givenname: Anthony Man-Cho
  orcidid: 0000-0003-2588-7851
  surname: So
  fullname: So, Anthony Man-Cho
  email: manchoso@se.cuhk.edu.hk
  organization: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong
CitedBy_id crossref_primary_10_1007_s10107_025_02262_9
ContentType Journal Article
Copyright Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
COPYRIGHT 2024 Springer
DOI 10.1007/s10107-023-02031-6
Discipline Engineering
Mathematics
EISSN 1436-4646
EndPage 74
ExternalDocumentID A812430369
10_1007_s10107_023_02031_6
GrantInformation_xml – fundername: Hong Kong Research Grants Council (RGC) General Research Fund (GRF)
  grantid: CUHK 14216122
ISICitedReferencesCount 2
ISSN 0025-5610
IsPeerReviewed true
IsScholarly true
Issue 1-2
Keywords Dimension-free rates
Lower bounds
Information-based complexity
Black-box optimization
Stationary points
Mathematics Subject Classification 68Q25
90C56
90C60
Language English
ORCID 0000-0003-2588-7851
PageCount 24
PublicationCentury 2000
PublicationDate 20241100
2024-11-00
20241101
PublicationDateYYYYMMDD 2024-11-01
PublicationDate_xml – month: 11
  year: 2024
  text: 20241100
PublicationDecade 2020
PublicationPlace Berlin/Heidelberg
PublicationPlace_xml – name: Berlin/Heidelberg
PublicationSubtitle A Publication of the Mathematical Optimization Society
PublicationTitle Mathematical programming
PublicationTitleAbbrev Math. Program
PublicationYear 2024
Publisher Springer Berlin Heidelberg
Springer
Publisher_xml – name: Springer Berlin Heidelberg
– name: Springer
StartPage 51
SubjectTerms Algorithms
Calculus of Variations and Optimal Control; Optimization
Combinatorics
Full Length Paper
Mathematical and Computational Physics
Mathematical Methods in Physics
Mathematics
Mathematics and Statistics
Mathematics of Computing
Numerical Analysis
Theoretical
Title No dimension-free deterministic algorithm computes approximate stationarities of Lipschitzians
URI https://link.springer.com/article/10.1007/s10107-023-02031-6
Volume 208
WOSCitedRecordID wos001137011800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint