Optimal Robust Output Containment of Unknown Heterogeneous Multiagent System Using Off-Policy Reinforcement Learning

Bibliographic Details
Published in: IEEE Transactions on Cybernetics, Vol. 48, No. 11, pp. 3197-3207
Main authors: Zuo, Shan; Song, Yongduan; Lewis, Frank L.; Davoudi, Ali
Format: Journal Article
Language: English
Published: United States, IEEE, 1 November 2018
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2168-2267, 2168-2275
Description
Abstract: This paper investigates the optimal robust output containment problem for general linear heterogeneous multiagent systems (MAS) with completely unknown dynamics. A model-based algorithm using offline policy iteration (PI) is first developed, where the $p$-copy internal model principle is used to address system parameter variations. This offline PI algorithm requires the nominal model of each agent, which may not be available in most real-world applications. To address this issue, a discounted performance function is introduced to express the optimal robust output containment problem as an optimal output-feedback design problem with bounded $\mathcal{L}_{2}$-gain. To solve this problem online in real time, a Bellman equation is first developed to simultaneously evaluate a given control policy and find the updated control policies, using only the state/output information measured online. Then, using this Bellman equation, a model-free off-policy integral reinforcement learning algorithm is proposed to solve the optimal robust output containment problem of heterogeneous MAS in real time, without requiring any knowledge of the system dynamics. Simulation results are provided to verify the effectiveness of the proposed method.
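Note: as a rough illustration of the terms used in the abstract, a bounded $\mathcal{L}_{2}$-gain condition with a discounted performance function is often written in the generic form sketched below. The symbols are assumptions chosen for illustration and are not taken from the paper: $y$ is a performance (containment error) output, $u$ the control input, $w$ the disturbance, $Q \succeq 0$ and $R \succ 0$ weighting matrices, $\gamma > 0$ a discount factor, and $\beta > 0$ the attenuation level. The paper's exact formulation may differ.

```latex
% Generic discounted L2-gain condition (illustrative notation, not the paper's)
\[
  \int_{t}^{\infty} e^{-\gamma(\tau - t)}
    \bigl( y^{\top} Q\, y + u^{\top} R\, u \bigr)\, d\tau
  \;\le\;
  \beta^{2} \int_{t}^{\infty} e^{-\gamma(\tau - t)}\, w^{\top} w \, d\tau
\]

% Associated discounted performance function: the control seeks to
% minimize J while the disturbance is treated as seeking to maximize it
\[
  J(t) = \int_{t}^{\infty} e^{-\gamma(\tau - t)}
    \bigl( y^{\top} Q\, y + u^{\top} R\, u - \beta^{2}\, w^{\top} w \bigr)\, d\tau
\]
```

Under these assumptions, the first inequality bounds the discounted output and control energy by $\beta^{2}$ times the discounted disturbance energy, and the second expression is the kind of cost that a Bellman equation would evaluate for a fixed policy.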
DOI: 10.1109/TCYB.2017.2761878