An adaptive dynamic programming-based algorithm for infinite-horizon linear quadratic stochastic optimal control problems
This paper develops a novel adaptive dynamic programming (ADP)-based model-free policy iteration (PI) algorithm to solve an infinite-horizon continuous-time linear quadratic stochastic (LQS) optimal control problem, where the diffusion term in system dynamics contains both control and state variable...
Uloženo v:
| Vydáno v: | Journal of applied mathematics & computing Ročník 69; číslo 3; s. 2741 - 2760 |
|---|---|
| Hlavní autor: | |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
Berlin/Heidelberg
Springer Berlin Heidelberg
01.06.2023
Springer Nature B.V |
| Témata: | |
| ISSN: | 1598-5865, 1865-2085 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | This paper develops a novel adaptive dynamic programming (ADP)-based model-free policy iteration (PI) algorithm to solve an infinite-horizon continuous-time linear quadratic stochastic (LQS) optimal control problem, where the diffusion term in system dynamics contains both control and state variables. First, we apply Ito’s lemma and take expectations to describe a relationship among the state trajectory, the control input and the matrices to be solved. Then, without needing the information of all system coefficient matrices, the ADP-based model-free algorithm is developed to approximate the optimal control from the collected data. Moreover, we give the convergence analysis under some mild conditions. Finally, a numerical example and an illustrative application are served to show that the proposed algorithm is effective. |
|---|---|
| Bibliografie: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1598-5865 1865-2085 |
| DOI: | 10.1007/s12190-023-01857-9 |