A data-driven α-policy iteration algorithm for optimal leader-following consensus of discrete-time multi-agent systems

Bibliographic Details
Published in: International Journal of Systems Science Vol. 56; no. 16; pp. 4055-4072
Main Authors: Xiang, Aoxue, Zhao, Xinyuan, Ma, Ruicheng
Format: Journal Article
Language: English
Published: London Taylor & Francis 10.12.2025
Taylor & Francis Ltd
ISSN: 0020-7721, 1464-5319
Description
Summary: In this paper, the data-driven α-policy iteration (PI) algorithm is proposed to address the optimal leader-following consensus problem of discrete-time multi-agent systems (MASs). Unlike existing results for the state consensus problem that utilise the PI algorithm, the novel algorithm leverages only the system's trajectory from historical data over a finite number of steps and does not require an admissible initial policy. Firstly, the linear quadratic regulator (LQR) design method is applied to derive the Bellman equation and the control policy based on the available measured data. Then, the data-driven α-PI algorithm is introduced, demonstrating a convergence rate that outperforms the value iteration (VI) algorithm and enabling all follower agents to track the trajectory of the leader agent. Finally, two examples are presented to demonstrate the performance of the proposed method.
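The trade-off the summary alludes to (VI needs no admissible initial policy but converges slowly, while standard PI converges quickly but must start from a stabilising gain) can be illustrated with a minimal, model-based sketch for a single discrete-time LQR problem. This is not the paper's data-driven α-PI algorithm, whose details are not given in this record. The matrices `A`, `B`, `Q`, `R`, the initial gain `K0`, and all function names below are assumptions chosen purely for illustration.

```python
import numpy as np


def greedy_gain(A, B, R, P):
    """Gain K that minimises the one-step Bellman right-hand side for value matrix P."""
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


def value_iteration(A, B, Q, R, tol=1e-10, max_iter=10_000):
    """Riccati recursion started from P = 0; no stabilising policy is needed."""
    P = np.zeros_like(Q)
    for k in range(1, max_iter + 1):
        K = greedy_gain(A, B, R, P)
        P_next = Q + A.T @ P @ (A - B @ K)
        P_next = 0.5 * (P_next + P_next.T)  # keep the iterate symmetric
        if np.max(np.abs(P_next - P)) < tol:
            return P_next, k
        P = P_next
    raise RuntimeError("value iteration did not converge")


def evaluate_policy(A, B, Q, R, K):
    """Solve the Lyapunov equation P = Acl' P Acl + Q + K' R K for a fixed gain K."""
    n = A.shape[0]
    Acl = A - B @ K
    Qk = Q + K.T @ R @ K
    M = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    return np.linalg.solve(M, Qk.reshape(-1)).reshape(n, n)


def policy_iteration(A, B, Q, R, K0, tol=1e-10, max_iter=100):
    """Standard policy iteration; K0 must be admissible (A - B K0 stable)."""
    if np.max(np.abs(np.linalg.eigvals(A - B @ K0))) >= 1.0:
        raise ValueError("initial gain is not stabilising")
    K, P_prev = K0, None
    for k in range(1, max_iter + 1):
        P = evaluate_policy(A, B, Q, R, K)   # policy evaluation
        K = greedy_gain(A, B, R, P)          # policy improvement
        if P_prev is not None and np.max(np.abs(P - P_prev)) < tol:
            return P, k
        P_prev = P
    raise RuntimeError("policy iteration did not converge")


# Illustrative (assumed) system: a sampled double integrator, open-loop unstable,
# so the zero gain is not admissible and PI needs a hand-picked stabilising K0.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])
K0 = np.array([[1.0, 1.5]])

P_vi, vi_iters = value_iteration(A, B, Q, R)
P_pi, pi_iters = policy_iteration(A, B, Q, R, K0)
print("VI iterations:", vi_iters, "PI iterations:", pi_iters)
```

Both routines approach the same solution of the discrete-time algebraic Riccati equation; the printed iteration counts show the usual gap in speed, which is the gap an algorithm interpolating between VI and PI would aim to narrow without requiring `K0`.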
DOI: 10.1080/00207721.2025.2482006