An adaptive dynamic programming-based algorithm for infinite-horizon linear quadratic stochastic optimal control problems
| Published in: | Journal of applied mathematics & computing Vol. 69; no. 3; pp. 2741 - 2760 |
|---|---|
| Main Author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg, 01.06.2023; Springer Nature B.V |
| Subjects: | |
| ISSN: | 1598-5865, 1865-2085 |
| Summary: | This paper develops a novel adaptive dynamic programming (ADP)-based model-free policy iteration (PI) algorithm to solve an infinite-horizon continuous-time linear quadratic stochastic (LQS) optimal control problem in which the diffusion term of the system dynamics contains both control and state variables. First, we apply Itô's lemma and take expectations to obtain a relationship among the state trajectory, the control input, and the matrices to be solved. Then, without requiring knowledge of all system coefficient matrices, the ADP-based model-free algorithm is developed to approximate the optimal control from collected data. Moreover, we give a convergence analysis under some mild conditions. Finally, a numerical example and an illustrative application are presented to show that the proposed algorithm is effective. |
| DOI: | 10.1007/s12190-023-01857-9 |
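
The summary describes a policy iteration scheme for an LQS problem whose diffusion term depends on both state and control. As a rough illustration only, the sketch below runs the classical model-based policy iteration on a scalar instance; it is not the paper's model-free ADP algorithm, which replaces knowledge of the coefficients with collected trajectory data. The scalar dynamics dx = (a x + b u) dt + (c x + d u) dW, the cost E∫(q x² + r u²) dt, and all names and numbers are assumptions introduced here, not taken from the record.

```python
def policy_iteration_scalar_lqs(a, b, c, d, q, r, k0, iters=50):
    """Model-based policy iteration for an assumed scalar LQS problem.

    Dynamics (assumed form): dx = (a x + b u) dt + (c x + d u) dW
    Cost (assumed form):     E int (q x^2 + r u^2) dt, with feedback u = k x
    """
    k = k0
    p = None
    for _ in range(iters):
        # Policy evaluation: scalar generalized Lyapunov equation
        #   [2(a + b k) + (c + d k)^2] p + q + r k^2 = 0
        drift = 2.0 * (a + b * k) + (c + d * k) ** 2
        if drift >= 0.0:
            raise ValueError("gain k is not mean-square stabilizing")
        p = -(q + r * k * k) / drift
        # Policy improvement: minimizer of the quadratic in u
        k = -(b + c * d) * p / (r + d * d * p)
    return p, k


def sare_residual(a, b, c, d, q, r, p):
    """Residual of the scalar stochastic algebraic Riccati equation at p."""
    return 2.0 * a * p + c * c * p + q - ((b + c * d) * p) ** 2 / (r + d * d * p)


if __name__ == "__main__":
    # Hypothetical coefficients; k0 = 0 is mean-square stabilizing here
    # because 2a + c^2 < 0.
    coeffs = dict(a=-1.0, b=1.0, c=0.2, d=0.1, q=1.0, r=1.0)
    p, k = policy_iteration_scalar_lqs(k0=0.0, **coeffs)
    print("p =", p, "k =", k)
    print("SARE residual =", sare_residual(p=p, **coeffs))
```

The initial gain must be mean-square stabilizing, which the function checks at each evaluation step. A data-driven variant in the spirit of the summary would estimate the evaluation step from state and input samples instead of using `a`, `b`, `c`, and `d` directly.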