Neural-network-based learning algorithms for cooperative games of discrete-time multi-player systems with control constraints via adaptive dynamic programming

Bibliographic Details
Published in:	Neurocomputing (Amsterdam) Vol. 344; pp. 13-19
Main Authors:	Jiang, He; Zhang, Huaguang; Xie, Xiangpeng; Han, Ji
Format:	Journal Article
Language:	English
Published:	Elsevier B.V. 07.06.2019
Subjects:	Adaptive dynamic programming; Approximate dynamic programming; Neural networks; Reinforcement learning
ISSN:	0925-2312, 1872-8286

Description
Summary:	Adaptive dynamic programming (ADP), an important branch of reinforcement learning, is a powerful tool for solving various optimal control problems. However, cooperative games of discrete-time multi-player systems with control constraints have rarely been investigated in this field. To address this gap, a novel policy iteration (PI) algorithm based on the ADP technique is proposed, and its convergence is analyzed in this brief paper. For the proposed PI algorithm, an online neural network (NN) implementation scheme with a multiple-network structure is presented. In the online NN-based learning algorithm, a critic network, constrained actor networks, and unconstrained actor networks are employed to approximate the value function, the constrained control policies, and the unconstrained control policies, respectively, and the NN weight updating laws are designed based on the gradient descent method. Finally, a numerical simulation example is presented to show the effectiveness of the proposed algorithm.
DOI:	10.1016/j.neucom.2018.02.107
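
The summary names policy iteration as the core of the method. As a rough illustration of the evaluate-then-improve loop only, the sketch below runs policy iteration on a scalar, linear, two-player cooperative system with a shared quadratic cost. It is not the algorithm from the paper: it leaves out the control constraints and the critic and actor neural networks, and the system x[k+1] = a*x + b1*u1 + b2*u2, the cost weights q, r1, r2, and the function name policy_iteration are all assumptions chosen for this example.

```python
def policy_iteration(a=0.9, b1=1.0, b2=0.5, q=1.0, r1=1.0, r2=1.0, iters=30):
    """Policy iteration for a hypothetical scalar two-player cooperative game.

    Assumed dynamics: x[k+1] = a*x + b1*u1 + b2*u2
    Assumed shared stage cost: q*x^2 + r1*u1^2 + r2*u2^2
    Policies are linear feedbacks u_i = -k_i * x; the value is V(x) = p * x^2.
    """
    # Zero gains are an admissible starting policy only because |a| < 1 here.
    k1, k2 = 0.0, 0.0
    s = b1 ** 2 / r1 + b2 ** 2 / r2
    history = []
    p = None
    for _ in range(iters):
        # Policy evaluation: solve p = q + r1*k1^2 + r2*k2^2 + p*ac^2 for p.
        ac = a - b1 * k1 - b2 * k2  # closed-loop coefficient
        if abs(ac) >= 1.0:
            raise ValueError("current policy is not stabilizing")
        p = (q + r1 * k1 ** 2 + r2 * k2 ** 2) / (1.0 - ac ** 2)
        history.append(p)
        # Policy improvement: both players jointly minimize
        # r1*u1^2 + r2*u2^2 + p*(a*x + b1*u1 + b2*u2)^2.
        k1 = p * b1 * a / (r1 * (1.0 + p * s))
        k2 = p * b2 * a / (r2 * (1.0 + p * s))
    return p, (k1, k2), history


if __name__ == "__main__":
    p, gains, history = policy_iteration()
    print("value coefficient p:", p)
    print("feedback gains (k1, k2):", gains)
```

In this unconstrained linear-quadratic setting the value coefficient p should settle at the solution of the scalar Riccati equation p = q + a^2*p / (1 + p*s), with s = b1^2/r1 + b2^2/r2. Handling bounded controls, as the paper does, would need a different cost and policy parameterization than this sketch uses.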