Stochastic linear quadratic optimal tracking control for discrete-time systems with delays based on Q-learning algorithm
In this paper, a reinforcement Q-learning method based on value iteration (Ⅵ) is proposed for a class of model-free stochastic linear quadratic (SLQ) optimal tracking problem with time delay. Compared with the traditional reinforcement learning method, Q-learning method avoids the need for accurate...
Uložené v:
| Vydané v: | AIMS mathematics Ročník 8; číslo 5; s. 10249 - 10265 |
|---|---|
| Hlavní autori: | , , |
| Médium: | Journal Article |
| Jazyk: | English |
| Vydavateľské údaje: |
AIMS Press
01.01.2023
|
| Predmet: | |
| ISSN: | 2473-6988, 2473-6988 |
| On-line prístup: | Získať plný text |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
| Shrnutí: | In this paper, a reinforcement Q-learning method based on value iteration (Ⅵ) is proposed for a class of model-free stochastic linear quadratic (SLQ) optimal tracking problem with time delay. Compared with the traditional reinforcement learning method, Q-learning method avoids the need for accurate system model. Firstly, the delay operator is introduced to construct a novel augmented system composed of the original system and the command generator. Secondly, the SLQ optimal tracking problem is transformed into a deterministic one by system transformation and the corresponding Q function of SLQ optimal tracking control is derived. Based on this, Q-learning algorithm is proposed and its convergence is proved. Finally, a simulation example shows the effectiveness of the proposed algorithm. |
|---|---|
| ISSN: | 2473-6988 2473-6988 |
| DOI: | 10.3934/math.2023519 |