Novel Discounted Adaptive Critic Control Designs With Accelerated Learning Formulation

Inspired by the successive relaxation method, a novel discounted iterative adaptive dynamic programming framework is developed, in which the iterative value function sequence possesses an adjustable convergence rate. The different convergence properties of the value function sequence and the stabili...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:IEEE transactions on cybernetics Ročník 54; číslo 5; s. 1 - 14
Hlavní autori: Ha, Mingming, Wang, Ding, Liu, Derong
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: United States IEEE 01.05.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Predmet:
ISSN:2168-2267, 2168-2275, 2168-2275
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:Inspired by the successive relaxation method, a novel discounted iterative adaptive dynamic programming framework is developed, in which the iterative value function sequence possesses an adjustable convergence rate. The different convergence properties of the value function sequence and the stability of the closed-loop systems under the new discounted value iteration (VI) are investigated. Based on the properties of the given VI scheme, an accelerated learning algorithm with convergence guarantee is presented. Moreover, the implementations of the new VI scheme and its accelerated learning design are elaborated, which involve value function approximation and policy improvement. A nonlinear fourth-order ball-and-beam balancing plant is used to verify the performance of the developed approaches. Compared with the traditional VI, the present discounted iterative adaptive critic designs greatly accelerate the convergence rate of the value function and reduce the computational cost simultaneously.
Bibliografia:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:2168-2267
2168-2275
2168-2275
DOI:10.1109/TCYB.2022.3233593