Convergence of Constant Step Stochastic Gradient Descent for Non-Smooth Non-Convex Functions

Bibliographic details
Published in: Set-valued and variational analysis Vol. 30; No. 3; pp. 1117 - 1147
Main authors: Bianchi, Pascal; Hachem, Walid; Schechtman, Sholom
Format: Journal Article
Language: English
Published: Dordrecht Springer Netherlands 01.09.2022
Springer Nature B.V
Springer
ISSN:1877-0533, 1877-0541
Online access: Full text
Abstract This paper studies the asymptotic behavior of constant-step Stochastic Gradient Descent for the minimization of an unknown function, defined as the expectation of a non-convex, non-smooth, locally Lipschitz random function. As the gradient may not exist, it is replaced by a certain operator: a reasonable choice is to use an element of the Clarke subdifferential of the random function; another choice is the output of the celebrated backpropagation algorithm, which is popular among practitioners, and whose properties have recently been studied by Bolte and Pauwels. Since the expectation of the chosen operator is not in general an element of the Clarke subdifferential of the mean function, it has been assumed in the literature that an oracle of the Clarke subdifferential of the mean function is available. As a first result, it is shown in this paper that such an oracle is not needed for almost all initialization points of the algorithm. Next, in the small step size regime, it is shown that the interpolated trajectory of the algorithm converges in probability (in the compact convergence sense) towards the set of solutions of a particular differential inclusion: the subgradient flow. Finally, viewing the iterates as a Markov chain whose transition kernel is indexed by the step size, it is shown that the invariant distributions of the kernel converge weakly to the set of invariant distributions of this differential inclusion as the step size tends to zero. These results show that when the step size is small, with large probability, the iterates eventually lie in a neighborhood of the critical points of the mean function.
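The constant-step iteration described in the abstract can be illustrated on a toy problem. The following is a minimal sketch under stated assumptions, not the authors' code: the random function |x - xi| with standard Gaussian xi, the function names clarke_subgradient_abs and constant_step_sgd, and all numerical values are hypothetical choices made here for illustration. The mean function of this toy problem is minimized at the median of xi, which is 0.

```python
import numpy as np


def clarke_subgradient_abs(x, xi):
    """Return one element of the Clarke subdifferential of x -> |x - xi|."""
    # Away from the kink this is the ordinary derivative, +1 or -1.
    # At the kink x == xi, np.sign returns 0, which lies in [-1, 1].
    return np.sign(x - xi)


def constant_step_sgd(x0, step, n_iter, rng):
    """Iterate x_{n+1} = x_n - step * g_n, with g_n a subgradient at a fresh sample."""
    x = float(x0)
    path = np.empty(n_iter + 1)
    path[0] = x
    for n in range(n_iter):
        xi = rng.standard_normal()  # toy assumption: xi ~ N(0, 1)
        x -= step * clarke_subgradient_abs(x, xi)
        path[n + 1] = x
    return path


# Hypothetical settings: start far from the minimizer, use a small fixed step.
rng = np.random.default_rng(0)
path = constant_step_sgd(x0=5.0, step=0.01, n_iter=20000, rng=rng)
tail = path[-5000:]  # late iterates, after the initial transient
```

With a constant step the iterates are not expected to converge to a single point. In this toy run they should instead keep fluctuating in a small neighborhood of the critical point 0, which is the qualitative behavior the abstract describes for small step sizes.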
This paper studies the asymptotic behavior of constant-step Stochastic Gradient Descent for the minimization of an unknown function F, defined as the expectation of a non-convex, non-smooth, locally Lipschitz random function. As the gradient may not exist, it is replaced by a certain operator: a reasonable choice is to use an element of the Clarke subdifferential of the random function; another choice is the output of the celebrated backpropagation algorithm, which is popular among practitioners, and whose properties have recently been studied by Bolte and Pauwels [7]. Since the expectation of the chosen operator is not in general an element of the Clarke subdifferential ∂F of the mean function, it has been assumed in the literature that an oracle of ∂F is available. As a first result, it is shown in this paper that such an oracle is not needed for almost all initialization points of the algorithm. Next, in the small step size regime, it is shown that the interpolated trajectory of the algorithm converges in probability (in the compact convergence sense) towards the set of solutions of the differential inclusion x'(t) ∈ −∂F(x(t)), the subgradient flow. Finally, viewing the iterates as a Markov chain whose transition kernel is indexed by the step size, it is shown that the invariant distributions of the kernel converge weakly to the set of invariant distributions of this differential inclusion as the step size tends to zero. These results show that when the step size is small, with large probability, the iterates eventually lie in a neighborhood of the critical points of the mean function F.
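For readers who prefer formulas, the objects named in the abstract can be written out as follows. This is a sketch in assumed notation: the symbols f, \xi, \varphi, \gamma and \mathcal{Z} do not appear in this record and are introduced here only for illustration, while F and \partial F denote the mean function and its Clarke subdifferential as in the abstract.

```latex
% Mean function: expectation of a locally Lipschitz random function f(., \xi).
F(x) = \mathbb{E}\big[ f(x, \xi) \big]

% Constant-step SGD with step size \gamma > 0 and i.i.d. samples \xi_1, \xi_2, \dots
% \varphi(x, s) stands for the chosen operator: an element of the Clarke
% subdifferential of f(., s) at x, or the output of backpropagation.
x_{n+1} = x_n - \gamma \, \varphi(x_n, \xi_{n+1})

% Limiting dynamics as \gamma \to 0: the subgradient flow.
\dot{x}(t) \in -\partial F\big(x(t)\big)

% Critical points of F, near which the iterates eventually lie with large probability.
\mathcal{Z} = \{\, x : 0 \in \partial F(x) \,\}
```

The difficulty highlighted in the abstract is that the expectation of \varphi(x, \xi) need not belong to \partial F(x) at every x, which is why the first result is stated for almost all initialization points.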
Author Schechtman, Sholom
Hachem, Walid
Bianchi, Pascal
Author_xml – sequence: 1
  givenname: Pascal
  surname: Bianchi
  fullname: Bianchi, Pascal
  organization: LTCI, Telecom Paris
– sequence: 2
  givenname: Walid
  surname: Hachem
  fullname: Hachem, Walid
  organization: LIGM, CNRS, Université Gustave Eiffel
– sequence: 3
  givenname: Sholom
  orcidid: 0000-0002-5390-4279
  surname: Schechtman
  fullname: Schechtman, Sholom
  email: sholom.schechtman@univ-eiffel.fr
  organization: LIGM, CNRS, Université Gustave Eiffel
BackLink https://hal.science/hal-02564349 (View record in HAL)
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Nature B.V. 2022
Distributed under a Creative Commons Attribution 4.0 International License
DOI 10.1007/s11228-022-00638-z
Discipline Mathematics
EISSN 1877-0541
EndPage 1147
ExternalDocumentID oai:HAL:hal-02564349v3
10_1007_s11228_022_00638_z
GrantInformation_xml – fundername: Conseil Régional, Île-de-France
  funderid: https://doi.org/10.13039/501100003990
ISICitedReferencesCount 23
ISSN 1877-0533
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 3
Keywords Clarke subdifferential
Backpropagation algorithm
Differential inclusions
65K05
Stochastic approximation
65K10
Non convex and non smooth optimization
90C15
34A60
Language English
License Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0
ORCID 0000-0002-5390-4279
0000-0001-8499-2761
OpenAccessLink https://hal.science/hal-02564349
PageCount 31
PublicationCentury 2000
PublicationDate 2022-09-01
PublicationDateYYYYMMDD 2022-09-01
PublicationDate_xml – month: 09
  year: 2022
  text: 2022-09-01
  day: 01
PublicationDecade 2020
PublicationPlace Dordrecht
PublicationPlace_xml – name: Dordrecht
PublicationSubtitle Theory and Applications
PublicationTitle Set-valued and variational analysis
PublicationTitleAbbrev Set-Valued Var. Anal
PublicationYear 2022
Publisher Springer Netherlands
Springer Nature B.V
Springer
Publisher_xml – name: Springer Netherlands
– name: Springer Nature B.V
– name: Springer
References Benveniste, A., Métivier, M., Priouret, P.: Adaptive Algorithms and Stochastic Approximations. Applications of Mathematics (New York), vol. 22. Springer, Berlin (1990). https://doi.org/10.1007/978-3-642-75894-2. Translated from the French by Stephen S. Wilson
Ruszczyński, A.: Convergence of a stochastic subgradient method with averaging for nonsmooth nonconvex constrained optimization. Optim. Lett. 14 (2020). https://doi.org/10.1007/s11590-020-01537-8
Paszke, A., Gross, S., Chintala, S., Chanan, G., Yang, E., DeVito, Z., Lin, Z., Desmaison, A., Antiga, L., Lerer, A.: Automatic Differentiation in PyTorch. In: NIPS-W (2017)
Kushner, H.J., Yin, G.G.: Stochastic Approximation and Recursive Algorithms and Applications, 2nd edn. Applications of Mathematics (New York), vol. 35. Springer, New York (2003). Stochastic Modelling and Applied Probability
Has’minskiĭ, R.Z.: The averaging principle for parabolic and elliptic differential equations and Markov processes with small diffusion. Teor. Verojatnost. i Primenen. 8, 3–25 (1963)
Majewski, S., Miasojedow, B., Moulines, E.: Analysis of nonsmooth stochastic approximation: the differential inclusion approach. arXiv:1805.01916 (2018)
Davis, D., Drusvyatskiy, D., Kakade, S., Lee, J.D.: Stochastic subgradient method converges on tame functions. Found. Comput. Math. 20, 119–154 (2020). https://doi.org/10.1007/s10208-018-09409-5
Aubin, J.P., Cellina, A.: Differential Inclusions. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 264. Springer, Berlin (1984). https://doi.org/10.1007/978-3-642-69512-4. Set-valued maps and viability theory
Faure, M., Roth, G.: Ergodic properties of weak asymptotic pseudotrajectories for set-valued dynamical systems. Stoch. Dyn. 13(1), 1250011, 23 (2013). https://doi.org/10.1142/S0219493712500116
Clarke, F.H., Ledyaev, Y.S., Stern, R.J., Wolenski, P.R.: Nonsmooth Analysis and Control Theory. Graduate Texts in Mathematics, vol. 178. Springer, New York (1998)
Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, Berlin (2006). https://doi.org/10.1007/3-540-29587-9
Meyn, S., Tweedie, R.L.: Markov Chains and Stochastic Stability, 2nd edn. Cambridge University Press, New York (2009). https://doi.org/10.1017/CBO9780511626630
Ermoliev, Y., Norkin, V.: Stochastic generalized gradient method for solving nonconvex nonsmooth stochastic optimization problems. Cybern. Syst. Anal. 34(2), 196–215 (1998). https://doi.org/10.1007/BF02742069. http://pure.iiasa.ac.at/id/eprint/5415
Aubin, J.P., Frankowska, H., Lasota, A.: Poincaré’s recurrence theorem for set-valued dynamical systems. Ann. Polon. Math. 54(1), 85–91 (1991). https://doi.org/10.4064/ap-54-1-85-91
van den Dries, L., Miller, C.: Geometric categories and o-minimal structures. Duke Math. J. 84(2), 497–540 (1996). https://doi.org/10.1215/S0012-7094-96-08416-1
Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM J. Optim. 18(2), 556–572 (2007). https://doi.org/10.1137/060670080
Norkin, V.: Generalized-differentiable functions. Cybern. Syst. Anal. 16, 10–12 (1980). https://doi.org/10.1007/BF01099354
Bolte, J., Pauwels, E.: Conservative set valued fields, automatic differentiation, stochastic gradient method and deep learning. arXiv:1909.10300 (2019)
Benaïm, M., Hofbauer, J., Sorin, S.: Stochastic approximations and differential inclusions. SIAM J. Control Optim. 44(1), 328–348 (2005) (electronic). https://doi.org/10.1137/S0363012904439301
Kakade, S., Lee, J.D.: Provably correct automatic sub-differentiation for qualified programs. In: Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., Garnett, R. (eds.) Advances in Neural Information Processing Systems, vol. 31, pp. 7125–7135. Curran Associates, Inc. (2018). http://papers.nips.cc/paper/7943-provably-correct-automatic-sub-differentiation-for-qualified-programs.pdf
Mikhalevich, V., Gupal, A., Norkin, V.: Methods of nonconvex optimization. Nauka (1987)
Folland, G.: Real Analysis: Modern Techniques and Their Applications. Pure and Applied Mathematics: A Wiley Series of Texts, Monographs and Tracts. Wiley, Hoboken (2013). https://books.google.fr/books?id=wI4fAwAAQBAJ
Lebourg, G.: Generic differentiability of Lipschitzian functions. Transactions of the American Mathematical Society 256, 125–144 (1979). https://doi.org/10.1090/S0002-9947-1979-0546911-1. http://www.jstor.org/stable/1998104
Ioffe, A.D.: An invitation to tame optimization. SIAM J. Optim. 19(4), 1894–1917 (2009). https://doi.org/10.1137/080722059
Ermoliev, Y.M., Norkin, V.: Solution of nonconvex nonsmooth stochastic optimization problems. Cybern. Syst. Anal. 39(5), 701–715 (2003). https://doi.org/10.1023/B:CASA.0000012091.84864.65
Bianchi, P., Hachem, W., Salim, A.: Constant step stochastic approximations involving differential inclusions: stability, long-run convergence and applications. Stochastics 91(2), 288–320 (2019). https://doi.org/10.1080/17442508.2018.1539086
Roth, G., Sandholm, W.H.: Stochastic approximations with constant step size and differential inclusions. SIAM J. Control Optim. 51(1), 525–555 (2013). https://doi.org/10.1137/110844192
StartPage 1117
SubjectTerms Algorithms
Analysis
Asymptotic properties
Back propagation
Convergence
Critical point
Invariants
Kernels
Markov chains
Mathematics
Mathematics and Statistics
Numerical Analysis
Operators (mathematics)
Optimization
Optimization and Control
Probability theory
Title Convergence of Constant Step Stochastic Gradient Descent for Non-Smooth Non-Convex Functions
URI https://link.springer.com/article/10.1007/s11228-022-00638-z
https://www.proquest.com/docview/2692136179
https://hal.science/hal-02564349
Volume 30