Online Adaptive Policy Learning Algorithm for H State Feedback Control of Unknown Affine Nonlinear Discrete-Time Systems
The problem of H ∞ state feedback control of affine nonlinear discrete-time systems with unknown dynamics is investigated in this paper. An online adaptive policy learning algorithm (APLA) based on adaptive dynamic programming (ADP) is proposed for learning in real-time the solution to the Hamilton-...
Uložené v:
| Vydané v: | IEEE transactions on cybernetics Ročník 44; číslo 12; s. 2706 - 2718 |
|---|---|
| Hlavní autori: | , , , |
| Médium: | Journal Article |
| Jazyk: | English |
| Vydavateľské údaje: |
United States
IEEE
01.12.2014
|
| Predmet: | |
| ISSN: | 2168-2267, 2168-2275, 2168-2275 |
| On-line prístup: | Získať plný text |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
| Shrnutí: | The problem of H ∞ state feedback control of affine nonlinear discrete-time systems with unknown dynamics is investigated in this paper. An online adaptive policy learning algorithm (APLA) based on adaptive dynamic programming (ADP) is proposed for learning in real-time the solution to the Hamilton-Jacobi-Isaacs (HJI) equation, which appears in the H ∞ control problem. In the proposed algorithm, three neural networks (NNs) are utilized to find suitable approximations of the optimal value function and the saddle point feedback control and disturbance policies. Novel weight updating laws are given to tune the critic, actor, and disturbance NNs simultaneously by using data generated in real-time along the system trajectories. Considering NN approximation errors, we provide the stability analysis of the proposed algorithm with Lyapunov approach. Moreover, the need of the system input dynamics for the proposed algorithm is relaxed by using a NN identification scheme. Finally, simulation examples show the effectiveness of the proposed algorithm. |
|---|---|
| Bibliografia: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 2168-2267 2168-2275 2168-2275 |
| DOI: | 10.1109/TCYB.2014.2313915 |