A data-driven α-policy iteration algorithm for optimal leader-following consensus of discrete-time multi-agent systems

Bibliographic details
Published in: International Journal of Systems Science, Vol. 56, No. 16, pp. 4055-4072
Main authors: Xiang, Aoxue; Zhao, Xinyuan; Ma, Ruicheng
Format: Journal Article
Language: English
Published: London: Taylor & Francis, 10 December 2025
Taylor & Francis Ltd
ISSN:0020-7721, 1464-5319
Description
Abstract: In this paper, a data-driven α-policy iteration (PI) algorithm is proposed to address the optimal leader-following consensus problem of discrete-time multi-agent systems (MASs). Unlike existing results for the state consensus problem that utilise the PI algorithm, the proposed algorithm uses only the system's trajectory from historical data over a finite number of steps and does not require an admissible initial policy. First, the linear quadratic regulator (LQR) design method is applied to derive the Bellman equation and the control policy from the available measured data. Then, the data-driven α-PI algorithm is introduced; it converges faster than the value iteration (VI) algorithm and enables all follower agents to track the trajectory of the leader agent. Finally, two examples are presented to demonstrate the performance of the proposed method.
DOI:10.1080/00207721.2025.2482006