A data-driven α-policy iteration algorithm for optimal leader-following consensus of discrete-time multi-agent systems

Published in: International journal of systems science, Vol. 56, No. 16, pp. 4055-4072
Main authors: Xiang, Aoxue; Zhao, Xinyuan; Ma, Ruicheng
Format: Journal Article
Language: English
Publication details: London: Taylor & Francis, 10 December 2025
Taylor & Francis Ltd
Subjects:
ISSN: 0020-7721, 1464-5319
Abstract In this paper, a data-driven α-policy iteration (PI) algorithm is proposed to address the optimal leader-following consensus problem of discrete-time multi-agent systems (MASs). Unlike existing results for the state consensus problem that utilise the PI algorithm, the novel algorithm leverages only the system's trajectory from historical data over a finite number of steps and does not require an admissible initial policy. Firstly, the linear quadratic regulator (LQR) design method is applied to derive the Bellman equation and the control policy based on the available measured data. Then, the data-driven α-PI algorithm is introduced, demonstrating a convergence rate that outperforms the value iteration (VI) algorithm and enabling all follower agents to track the trajectory of the leader agent. Finally, two examples are presented to demonstrate the performance of the proposed method.
Author Xiang, Aoxue
Ma, Ruicheng
Zhao, Xinyuan
Author_xml – sequence: 1
  givenname: Aoxue
  surname: Xiang
  fullname: Xiang, Aoxue
  organization: Beijing University of Technology
– sequence: 2
  givenname: Xinyuan
  surname: Zhao
  fullname: Zhao, Xinyuan
  organization: Beijing University of Technology
– sequence: 3
  givenname: Ruicheng
  surname: Ma
  fullname: Ma, Ruicheng
  email: maruicheng@lnu.edu.cn
  organization: Liaoning University
ContentType Journal Article
Copyright 2025 Informa UK Limited, trading as Taylor & Francis Group
Copyright_xml – notice: 2025 Informa UK Limited, trading as Taylor & Francis Group 2025
– notice: 2025 Informa UK Limited, trading as Taylor & Francis Group
DOI 10.1080/00207721.2025.2482006
DatabaseName CrossRef
Computer and Information Systems Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 1464-5319
EndPage 4072
ExternalDocumentID 10_1080_00207721_2025_2482006
2482006
Genre Research Article
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 12271015; 62473183
– fundername: Scientific Research Fund of Educational Department of Liaoning Province
  grantid: JYTMS20230773
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001452284300001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0020-7721
IngestDate Sat Nov 01 14:31:02 EDT 2025
Sat Nov 29 07:00:30 EST 2025
Sat Nov 01 14:37:34 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 16
Language English
LinkModel DirectLink
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
PQID 3267498313
PQPubID 2045514
PageCount 18
ParticipantIDs proquest_journals_3267498313
informaworld_taylorfrancis_310_1080_00207721_2025_2482006
crossref_primary_10_1080_00207721_2025_2482006
PublicationCentury 2000
PublicationDate 2025-12-10
PublicationDateYYYYMMDD 2025-12-10
PublicationDate_xml – month: 12
  year: 2025
  text: 2025-12-10
  day: 10
PublicationDecade 2020
PublicationPlace London
PublicationPlace_xml – name: London
PublicationTitle International journal of systems science
PublicationYear 2025
Publisher Taylor & Francis
Taylor & Francis Ltd
Publisher_xml – name: Taylor & Francis
– name: Taylor & Francis Ltd
SSID ssj0005520
Score 2.4249713
SourceID proquest
crossref
informaworld
SourceType Aggregation Database
Index Database
Publisher
StartPage 4055
SubjectTerms Bellman theory
Discrete time systems
Iterative algorithms
Linear quadratic regulator
model-free
Multi-agent systems
Multiagent systems
reinforcement learning
state consensus
α-policy iteration
Title A data-driven α-policy iteration algorithm for optimal leader-following consensus of discrete-time multi-agent systems
URI https://www.tandfonline.com/doi/abs/10.1080/00207721.2025.2482006
https://www.proquest.com/docview/3267498313
Volume 56
WOSCitedRecordID wos001452284300001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVAWR
  databaseName: Taylor & Francis
  customDbUrl:
  eissn: 1464-5319
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0005520
  issn: 0020-7721
  databaseCode: TFW
  dateStart: 19701001
  isFulltext: true
  titleUrlDefault: https://www.tandfonline.com
  providerName: Taylor & Francis