Adaptive Event-Triggered Output Synchronization of Heterogeneous Multiagent Systems: A Model-Free Reinforcement Learning Approach


Detailed Description

Bibliographic Details
Published in: IEEE Transactions on Signal and Information Processing over Networks, Vol. 11, pp. 604-616
Main Authors: Hu, Wenfeng; Wang, Xuan; Guo, Meichen; Luo, Biao; Huang, Tingwen
Format: Journal Article
Language: English
Published: Piscataway: IEEE, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:2373-776X, 2373-7778
Online Access: Full text
Description
Abstract: This paper proposes a reinforcement learning approach to the output synchronization problem for heterogeneous leader-follower multi-agent systems in which the dynamics of all agents are completely unknown. First, to address the challenge posed by the leader's unknown dynamics, we develop an experience-replay learning method that estimates the leader's dynamics using only its past state and output information as training data. Second, based on the estimated leader dynamics, we design an event-triggered observer for each follower to estimate the leader's state and output. The experience-replay learning method and the event-triggered leader observer are co-designed, which guarantees convergence and excludes Zeno behavior. Subsequently, to free the followers from any reliance on system dynamics, a data-driven adaptive dynamic programming (ADP) method is presented to iteratively derive the optimal control gains, based on which we design a policy iteration (PI) algorithm for output synchronization. Finally, the proposed algorithm's performance is validated through simulation.
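The abstract's first step, estimating the leader's unknown dynamics from recorded state data, can be illustrated with a minimal sketch. Assuming (for illustration only) discrete-time linear leader dynamics x_{k+1} = S x_k, a buffer of past states acts as the "experience replay" data and S can be recovered by batch least squares; the paper's actual method is more involved, and the matrix `S_true` and all variable names below are hypothetical.

```python
import numpy as np

# Hypothetical leader dynamics matrix, unknown to the learner.
S_true = np.array([[0.9, 0.2],
                   [-0.2, 0.9]])

# Experience-replay buffer: record a trajectory of past leader states.
x = np.array([1.0, -0.5])
X, X_next = [], []
for _ in range(50):
    x_next = S_true @ x
    X.append(x)
    X_next.append(x_next)
    x = x_next

X = np.array(X).T          # states x_k,   shape (2, 50)
X_next = np.array(X_next).T  # states x_{k+1}, shape (2, 50)

# Batch least-squares estimate: S_hat = X_next X^T (X X^T)^{-1}.
S_hat = X_next @ X.T @ np.linalg.inv(X @ X.T)
print(S_hat)
```

With noise-free data from a trajectory that excites both state directions, the estimate recovers the true matrix exactly; in practice a persistence-of-excitation condition on the stored data is what makes this inversion well posed.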
DOI:10.1109/TSIPN.2025.3578759