Reinforcement-Learning-Based Robust Controller Design for Continuous-Time Uncertain Nonlinear Systems Subject to Input Constraints
The design of stabilizing controller for uncertain nonlinear systems with control constraints is a challenging problem. The constrained-input coupled with the inability to identify accurately the uncertainties motivates the design of stabilizing controller based on reinforcement-learning (RL) method...
Uloženo v:
| Vydáno v: | IEEE transactions on cybernetics Ročník 45; číslo 7; s. 1372 - 1385 |
|---|---|
| Hlavní autoři: | , , , |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
United States
IEEE
01.07.2015
|
| Témata: | |
| ISSN: | 2168-2267, 2168-2275 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | The design of stabilizing controller for uncertain nonlinear systems with control constraints is a challenging problem. The constrained-input coupled with the inability to identify accurately the uncertainties motivates the design of stabilizing controller based on reinforcement-learning (RL) methods. In this paper, a novel RL-based robust adaptive control algorithm is developed for a class of continuous-time uncertain nonlinear systems subject to input constraints. The robust control problem is converted to the constrained optimal control problem with appropriately selecting value functions for the nominal system. Distinct from typical action-critic dual networks employed in RL, only one critic neural network (NN) is constructed to derive the approximate optimal control. Meanwhile, unlike initial stabilizing control often indispensable in RL, there is no special requirement imposed on the initial control. By utilizing Lyapunov's direct method, the closed-loop optimal control system and the estimated weights of the critic NN are proved to be uniformly ultimately bounded. In addition, the derived approximate optimal control is verified to guarantee the uncertain nonlinear system to be stable in the sense of uniform ultimate boundedness. Two simulation examples are provided to illustrate the effectiveness and applicability of the present approach. |
|---|---|
| Bibliografie: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 2168-2267 2168-2275 |
| DOI: | 10.1109/TCYB.2015.2417170 |