Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming

Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool in solving various optimal control problems. However, the cooperative game issues of discrete-time multi-player systems with control constraints have rarely been investigated in this field. In order...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:Neurocomputing (Amsterdam) Ročník 344; s. 13 - 19
Hlavní autoři: Jiang, He, Zhang, Huaguang, Xie, Xiangpeng, Han, Ji
Médium: Journal Article
Jazyk:angličtina
Vydáno: Elsevier B.V 07.06.2019
Témata:
ISSN:0925-2312, 1872-8286
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool in solving various optimal control problems. However, the cooperative game issues of discrete-time multi-player systems with control constraints have rarely been investigated in this field. In order to address this issue, a novel policy iteration (PI) algorithm is proposed based on ADP technique, and its associated convergence analysis is also studied in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with multiple-network structure is presented. In the online NN-based learning algorithm, critic network, constrained actor networks and unconstrained actor networks are employed to approximate the value function, constrained and unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is illustrated to show the effectiveness.
ISSN:0925-2312
1872-8286
DOI:10.1016/j.neucom.2018.02.107