Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning

Multi-robot systems (MRSs) are becoming increasingly important in various domains. However, effective communication and coordination among multiple robots remain significant challenges. In this letter, we introduce a novel architecture for multi-robot decision-making and control based on multi-agent...

Detailed bibliography
Published in: IEEE Robotics and Automation Letters, Volume 9, Issue 1, pp. 159-166
Main authors: Zhang, Han; Zhang, Xiaohui; Feng, Zhao; Xiao, Xiaohui
Format: Journal Article
Language: English
Published: Piscataway, IEEE, 01.01.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2377-3766
Abstract Multi-robot systems (MRSs) are becoming increasingly important in various domains. However, effective communication and coordination among multiple robots remain significant challenges. In this letter, we introduce a novel architecture for multi-robot decision-making and control based on multi-agent reinforcement learning (MARL). Our architecture can accommodate heterogeneous robots operating asynchronously in different scenarios. We propose an improved practical Q-value mixing network (Qrainbow), which builds on value-decomposition networks and applies the multi-head attention mixer of Qatten and effective components from Rainbow, such as double network, dueling network, and prioritized experience replay. To migrate the algorithm to MRS, we fuse macro-action into Qrainbow and make a slight change to the process of calculating the loss function, enabling Qrainbow to work in asynchronous scenarios. We evaluate our architecture in both the benchmark environment for MARL and a multi-robot environment with varying layouts. In terms of convergence speed and final result, Qrainbow outperforms other state-of-the-art MARL algorithms. Additionally, our architecture achieves superior performance in reducing time costs and avoiding collisions between robots in homogeneous and heterogeneous multi-robot cooperation tasks.
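The abstract names several standard building blocks: dueling networks, double networks, value decomposition, and macro-actions. The following is a minimal sketch of those generic ideas only, not the authors' Qrainbow implementation. All function names are hypothetical. The plain sum in `vdn_mix` stands in for the attention-based mixer, and discounting by `gamma ** duration` is a common semi-MDP convention assumed here for illustration; the record does not state how the paper changes the loss calculation.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(a) = V + A(a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def argmax(xs: Sequence[float]) -> int:
    """Index of the largest element (first one on ties)."""
    return max(range(len(xs)), key=lambda i: xs[i])


def double_q_bootstrap(online_next: Sequence[float],
                       target_next: Sequence[float]) -> float:
    """Double Q-learning: the online net selects the action,
    the target net supplies its value."""
    return target_next[argmax(online_next)]


def vdn_mix(per_agent_q: Sequence[float]) -> float:
    """Value-decomposition (VDN-style) mixing: the joint value is the sum
    of per-agent values. An attention mixer would weight the terms instead."""
    return sum(per_agent_q)


def td_target(reward: float, gamma: float, duration: int,
              next_joint_q: float, done: bool) -> float:
    """TD target for a temporally extended action (assumed convention):
    the bootstrap term is discounted by gamma ** duration."""
    if done:
        return reward
    return reward + (gamma ** duration) * next_joint_q


if __name__ == "__main__":
    # Two agents, each with its own online and target estimates for the next state.
    online = [dueling_q(1.0, [1.0, 2.0, 3.0]), dueling_q(0.5, [0.0, 1.0])]
    target = [[0.2, 0.4, 0.6], [1.0, 3.0]]
    per_agent = [double_q_bootstrap(o, t) for o, t in zip(online, target)]
    joint = vdn_mix(per_agent)
    print(td_target(reward=1.0, gamma=0.5, duration=2,
                    next_joint_q=joint, done=False))
```

Keeping each component as a small pure function makes it easy to swap the mixer or the target rule without touching the rest.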
Author Zhang, Han
Xiao, Xiaohui
Feng, Zhao
Zhang, Xiaohui
Author_xml – sequence: 1
  givenname: Han
  orcidid: 0000-0002-8591-7736
  surname: Zhang
  fullname: Zhang, Han
  email: sagirizhanghan@whu.edu.cn
  organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China
– sequence: 2
  givenname: Xiaohui
  orcidid: 0009-0008-2436-5350
  surname: Zhang
  fullname: Zhang, Xiaohui
  email: zhangxh@whu.edu.cn
  organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China
– sequence: 3
  givenname: Zhao
  orcidid: 0000-0001-7213-9413
  surname: Feng
  fullname: Feng, Zhao
  email: fengzhao@whu.edu.cn
  organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China
– sequence: 4
  givenname: Xiaohui
  orcidid: 0000-0002-8212-2452
  surname: Xiao
  fullname: Xiao, Xiaohui
  email: xhxiao@whu.edu.cn
  organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China
CODEN IRALC6
CitedBy_id crossref_primary_10_1002_rob_22441
crossref_primary_10_1016_j_cnsns_2025_108846
crossref_primary_10_1109_ACCESS_2024_3518924
crossref_primary_10_1002_rnc_7910
crossref_primary_10_1016_j_jnca_2025_104231
crossref_primary_10_1109_TASE_2024_3486063
crossref_primary_10_1016_j_jfranklin_2025_107775
Cites_doi 10.1613/jair.1.11418
10.1109/LRA.2021.3092305
10.1609/aaai.v32i1.11631
10.1109/LRA.2022.3161699
10.1016/j.eswa.2017.05.074
10.1038/nature14236
10.3389/frobt.2018.00059
10.1609/aaai.v32i1.11796
10.1609/aaai.v30i1.10295
10.1109/LRA.2021.3092290
10.1006/ijhc.1997.0162
10.1007/978-3-030-60990-0_12
10.1109/LRA.2022.3148489
10.1109/ICRA48506.2021.9561359
10.1109/LRA.2022.3224667
10.1016/0004-3702(92)90058-6
10.1109/LRA.2022.3191204
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/LRA.2023.3328448
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList
Technology Research Database
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISSN 2377-3766
EndPage 166
ExternalDocumentID 10_1109_LRA_2023_3328448
10301527
Genre orig-research
GrantInformation_xml – fundername: National Key R&D Program of China
  grantid: 2018YFB2100903
– fundername: Knowledge Innovation Program of Wuhan-Shuguang Project
  grantid: 2023010201020252
GroupedDBID 0R~
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFS
AGQYO
AGSQL
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
EBS
EJD
IFIPE
IPLJI
JAVBF
KQ8
M43
M~E
O9-
OCL
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
ID FETCH-LOGICAL-c292t-cf6318a91956d02c7cf7bccc6a4810874a3d65d26969bbbbc2cf849fe8b7d1623
IEDL.DBID RIE
ISICitedReferencesCount 10
ISSN 2377-3766
IngestDate Mon Jun 30 04:21:55 EDT 2025
Tue Nov 18 21:18:44 EST 2025
Sat Nov 29 06:03:27 EST 2025
Wed Aug 27 02:29:06 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c292t-cf6318a91956d02c7cf7bccc6a4810874a3d65d26969bbbbc2cf849fe8b7d1623
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-8212-2452
0000-0001-7213-9413
0009-0008-2436-5350
0000-0002-8591-7736
PQID 2892376562
PQPubID 4437225
PageCount 8
ParticipantIDs crossref_primary_10_1109_LRA_2023_3328448
crossref_citationtrail_10_1109_LRA_2023_3328448
proquest_journals_2892376562
ieee_primary_10301527
PublicationCentury 2000
PublicationDate 2024-Jan.
2024-1-00
20240101
PublicationDateYYYYMMDD 2024-01-01
PublicationDate_xml – month: 01
  year: 2024
  text: 2024-Jan.
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE robotics and automation letters
PublicationTitleAbbrev LRA
PublicationYear 2024
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SSID ssj0001527395
Score 2.3073626
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 159
SubjectTerms Algorithms
asynchronous execution
Collision avoidance
Cooperation
Decision making
Heterogeneous networks
heterogeneous robots
Multi-robot systems
Multiagent systems
Multiple robots
Q-learning
Reinforcement learning
Robot control
Robot kinematics
Robots
System effectiveness
Task analysis
Training
Title Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning
URI https://ieeexplore.ieee.org/document/10301527
https://www.proquest.com/docview/2892376562
Volume 9
WOSCitedRecordID wos001257126000013
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: RIE
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
– providerCode: PRVHPJ
  databaseName: ROAD: Directory of Open Access Scholarly Resources (ISSN International Center)
  customDbUrl:
  eissn: 2377-3766
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0001527395
  issn: 2377-3766
  databaseCode: M~E
  dateStart: 20160101
  isFulltext: true
  titleUrlDefault: https://road.issn.org
  providerName: ISSN International Centre