No dimension-free deterministic algorithm computes approximate stationarities of Lipschitzians
We consider the oracle complexity of computing an approximate stationary point of a Lipschitz function. When the function is smooth, it is well known that the simple deterministic gradient method has finite dimension-free oracle complexity. However, when the function can be nonsmooth, it is only rec...
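As an illustration of the smooth case mentioned above, here is a minimal sketch that is not taken from the article. It assumes a hypothetical test function f(x) = ||x||^2 / (1 + ||x||^2), which is nonconvex and has a 2-Lipschitz gradient, and a fixed step size 1/L. The helper names (`gradient_method`, `call_budget`, `start_point`) are invented for this example. By the standard descent-lemma argument, the gradient method reaches a point with gradient norm at most eps after at most floor(2 L Δ / eps^2) + 1 gradient calls, where Δ = f(x0) - inf f. This bound does not involve the dimension.

```python
import math

L_SMOOTH = 2.0  # Lipschitz constant of grad_f for the test function below


def f(x):
    """Hypothetical test function f(x) = ||x||^2 / (1 + ||x||^2); inf f = 0."""
    s = sum(xi * xi for xi in x)
    return s / (1.0 + s)


def grad_f(x):
    """Gradient of f: 2 x / (1 + ||x||^2)^2."""
    s = sum(xi * xi for xi in x)
    c = 2.0 / (1.0 + s) ** 2
    return [c * xi for xi in x]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def call_budget(delta, lipschitz, eps):
    """Dimension-free bound on gradient calls: floor(2 L delta / eps^2) + 1."""
    return math.floor(2.0 * lipschitz * delta / eps ** 2) + 1


def gradient_method(grad, x0, lipschitz, eps, max_calls):
    """Iterate x <- x - grad(x) / L until ||grad(x)|| <= eps.

    Returns the final point and the number of gradient (oracle) calls used.
    """
    x = list(x0)
    for calls in range(1, max_calls + 1):
        g = grad(x)
        if norm(g) <= eps:
            return x, calls
        x = [xi - gi / lipschitz for xi, gi in zip(x, g)]
    raise RuntimeError("oracle budget exhausted before reaching eps-stationarity")


def start_point(dim, radius):
    """A point of norm `radius` in R^dim, so that f(x0) is the same for every dim."""
    return [radius / math.sqrt(dim)] * dim


if __name__ == "__main__":
    eps = 1e-2
    for dim in (2, 1000):
        x0 = start_point(dim, 2.0)
        budget = call_budget(f(x0), L_SMOOTH, eps)
        _, calls = gradient_method(grad_f, x0, L_SMOOTH, eps, budget)
        print(f"dim={dim}: {calls} gradient calls (budget {budget})")
```

The budget depends only on L, Δ, and eps. That is the sense in which the smooth case has "finite dimension-free oracle complexity"; the article's negative result concerns the nonsmooth Lipschitz setting, which this sketch does not cover.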
| Published in: | Mathematical Programming, Vol. 208, No. 1-2, pp. 51-74 |
|---|---|
| Main authors: | Tian, Lai; So, Anthony Man-Cho |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg; Springer, 1 November 2024 |
| ISSN: | 0025-5610, 1436-4646 |
| Online access: | Full text |
| Abstract | We consider the oracle complexity of computing an approximate stationary point of a Lipschitz function. When the function is smooth, it is well known that the simple deterministic gradient method has finite dimension-free oracle complexity. However, when the function can be nonsmooth, it is only recently that a randomized algorithm with finite dimension-free oracle complexity has been developed. In this paper, we show that no deterministic algorithm can do the same. Moreover, even without the dimension-free requirement, we show that any finite-time deterministic method cannot be general zero-respecting. In particular, this implies that a natural derandomization of the aforementioned randomized algorithm cannot have finite-time complexity. Our results reveal a fundamental hurdle in modern large-scale nonconvex nonsmooth optimization. |
|---|---|
| Audience | Academic |
| Author | So, Anthony Man-Cho; Tian, Lai |
| Author_xml | 1. Tian, Lai (Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong); 2. So, Anthony Man-Cho (ORCID: 0000-0003-2588-7851; email: manchoso@se.cuhk.edu.hk; Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong) |
| CitedBy_id | crossref_primary_10_1007_s10107_025_02262_9 |
| Cites_doi | 10.1137/1.9781611976748 10.1137/090748408 10.1137/030601296 10.1137/18M1178244 10.1007/BF01589116 10.1007/s10957-022-02093-0 10.1287/moor.2022.0289 10.1137/120880811 10.1007/s10107-019-01431-x 10.1145/102782.102783 10.1137/1.9781611971309 10.1007/BF01584320 10.1007/s10208-018-09409-5 10.1145/3418526 10.1007/s10107-006-0706-8 10.1109/TIT.2017.2701343 10.1137/S0363012904439301 10.1016/j.tcs.2014.06.006 10.1007/s10107-019-01406-y 10.1137/17M1151031 10.1137/0803004 10.1109/MSP.2020.3003845 10.1007/978-3-319-91578-4 10.1137/050639673 10.1137/090774100 10.1137/1.9780898719857 10.1007/978-3-030-34910-3_6 10.1137/19M1298147 |
| ContentType | Journal Article |
| Copyright | Springer-Verlag GmbH Germany, part of Springer Nature and Mathematical Optimization Society 2024. Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. COPYRIGHT 2024 Springer |
| DBID | AAYXX CITATION |
| DOI | 10.1007/s10107-023-02031-6 |
| DatabaseName | CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering Mathematics |
| EISSN | 1436-4646 |
| EndPage | 74 |
| ExternalDocumentID | A812430369 10_1007_s10107_023_02031_6 |
| GrantInformation_xml | Funder: Hong Kong Research Grants Council (RGC) General Research Fund (GRF); grant ID: CUHK 14216122 |
| IEDL.DBID | RSV |
| ISICitedReferencesCount | 2 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001137011800001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0025-5610 |
| IngestDate | Sat Nov 29 10:28:28 EST 2025 Tue Nov 18 22:27:36 EST 2025 Sat Nov 29 03:34:04 EST 2025 Fri Feb 21 02:37:19 EST 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1-2 |
| Keywords | Dimension-free rates; Lower bounds; Information-based complexity; Black-box optimization; Stationary points; MSC codes: 68Q25, 90C56, 90C60 |
| Language | English |
| LinkModel | DirectLink |
| ORCID | 0000-0003-2588-7851 |
| PageCount | 24 |
| PublicationCentury | 2000 |
| PublicationDate | 20241100 2024-11-00 20241101 |
| PublicationDateYYYYMMDD | 2024-11-01 |
| PublicationDate_xml | – month: 11 year: 2024 text: 20241100 |
| PublicationDecade | 2020 |
| PublicationPlace | Berlin/Heidelberg |
| PublicationPlace_xml | – name: Berlin/Heidelberg |
| PublicationSubtitle | A Publication of the Mathematical Optimization Society |
| PublicationTitle | Mathematical programming |
| PublicationTitleAbbrev | Math. Program |
| PublicationYear | 2024 |
| Publisher | Springer Berlin Heidelberg; Springer |
| Publisher_xml | – name: Springer Berlin Heidelberg – name: Springer |
| SSID | ssj0001388 |
| Score | 2.4379888 |
| SourceID | gale crossref springer |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 51 |
| SubjectTerms | Algorithms, Calculus of Variations and Optimal Control; Optimization, Combinatorics, Full Length Paper, Mathematical and Computational Physics, Mathematical Methods in Physics, Mathematics, Mathematics and Statistics, Mathematics of Computing, Numerical Analysis, Theoretical |
| Title | No dimension-free deterministic algorithm computes approximate stationarities of Lipschitzians |
| URI | https://link.springer.com/article/10.1007/s10107-023-02031-6 |
| Volume | 208 |