Sparsity in long-time control of neural ODEs


Bibliographic details
Published in: Systems & Control Letters, Vol. 172; p. 105452
Main authors: Esteve-Yagüe, Carlos; Geshkovski, Borjan
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.02.2023
ISSN: 0167-6911, 1872-7956
Description
Summary: We consider the neural ODE and optimal control perspective of supervised learning, with ℓ1-control penalties, where rather than only minimizing a final cost (the empirical risk) for the state, we integrate this cost over the entire time horizon. We prove that any optimal control (for this cost) vanishes beyond some positive stopping time. When seen in the discrete-time context, this result entails an ordered sparsity pattern for the parameters of the associated residual neural network: ordered in the sense that these parameters are all 0 beyond a certain layer. Furthermore, we provide a polynomial stability estimate for the empirical risk with respect to the time horizon. This can be seen as a turnpike property, for nonsmooth dynamics and functionals with ℓ1 penalties, and without any smallness assumptions on the data, both of which are new in the literature.
DOI: 10.1016/j.sysconle.2022.105452