Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning
Multi-robot systems (MRSs) are becoming increasingly important in various domains. However, effective communication and coordination among multiple robots remain significant challenges. In this letter, we introduce a novel architecture for multi-robot decision-making and control based on multi-agent...
| Published in: | IEEE Robotics and Automation Letters, Vol. 9, No. 1, pp. 159-166 |
|---|---|
| Main authors: | Zhang, Han; Zhang, Xiaohui; Feng, Zhao; Xiao, Xiaohui |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), January 1, 2024 |
| Subjects: | |
| ISSN: | 2377-3766 |
| Abstract | Multi-robot systems (MRSs) are becoming increasingly important in various domains. However, effective communication and coordination among multiple robots remain significant challenges. In this letter, we introduce a novel architecture for multi-robot decision-making and control based on multi-agent reinforcement learning (MARL). Our architecture can accommodate heterogeneous robots operating asynchronously in different scenarios. We propose an improved practical Q-value mixing network (Qrainbow), which builds on value-decomposition networks and applies the multi-head attention mixer of Qatten and effective components from Rainbow, such as double network, dueling network, and prioritized experience replay. To migrate the algorithm to MRS, we fuse macro-action into Qrainbow and make a slight change to the process of calculating the loss function, enabling Qrainbow to work in asynchronous scenarios. We evaluate our architecture in both the benchmark environment for MARL and a multi-robot environment with varying layouts. In terms of convergence speed and final result, Qrainbow outperforms other state-of-the-art MARL algorithms. Additionally, our architecture achieves superior performance in reducing time costs and avoiding collisions between robots in homogeneous and heterogeneous multi-robot cooperation tasks. |
|---|---|
| Author | Zhang, Han; Xiao, Xiaohui; Feng, Zhao; Zhang, Xiaohui |
| Author_xml | – sequence: 1 givenname: Han orcidid: 0000-0002-8591-7736 surname: Zhang fullname: Zhang, Han email: sagirizhanghan@whu.edu.cn organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China – sequence: 2 givenname: Xiaohui orcidid: 0009-0008-2436-5350 surname: Zhang fullname: Zhang, Xiaohui email: zhangxh@whu.edu.cn organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China – sequence: 3 givenname: Zhao orcidid: 0000-0001-7213-9413 surname: Feng fullname: Feng, Zhao email: fengzhao@whu.edu.cn organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China – sequence: 4 givenname: Xiaohui orcidid: 0000-0002-8212-2452 surname: Xiao fullname: Xiao, Xiaohui email: xhxiao@whu.edu.cn organization: School of Power and Mechanical Engineering, Wuhan University, Wuhan, China |
| CODEN | IRALC6 |
| CitedBy_id | crossref_primary_10_1002_rob_22441 crossref_primary_10_1016_j_cnsns_2025_108846 crossref_primary_10_1109_ACCESS_2024_3518924 crossref_primary_10_1002_rnc_7910 crossref_primary_10_1016_j_jnca_2025_104231 crossref_primary_10_1109_TASE_2024_3486063 crossref_primary_10_1016_j_jfranklin_2025_107775 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024 |
| DOI | 10.1109/LRA.2023.3328448 |
| DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| Discipline | Engineering |
| EISSN | 2377-3766 |
| EndPage | 166 |
| ExternalDocumentID | 10_1109_LRA_2023_3328448 10301527 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: National Key R&D Program of China grantid: 2018YFB2100903 – fundername: Knowledge Innovation Program of Wuhan-Shuguang Project grantid: 2023010201020252 |
| ISICitedReferencesCount | 10 |
| ISSN | 2377-3766 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-8212-2452 0000-0001-7213-9413 0009-0008-2436-5350 0000-0002-8591-7736 |
| PageCount | 8 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-Jan. 2024-1-00 20240101 |
| PublicationDateYYYYMMDD | 2024-01-01 |
| PublicationDate_xml | – month: 01 year: 2024 text: 2024-Jan. |
| PublicationDecade | 2020 |
| PublicationPlace | Piscataway |
| PublicationPlace_xml | – name: Piscataway |
| PublicationTitle | IEEE robotics and automation letters |
| PublicationTitleAbbrev | LRA |
| PublicationYear | 2024 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 159 |
| SubjectTerms | Algorithms; asynchronous execution; Collision avoidance; Cooperation; Decision making; Heterogeneous networks; heterogeneous robots; Multi-robot systems; Multiagent systems; Multiple robots; Q-learning; Reinforcement learning; Robot control; Robot kinematics; Robots; System effectiveness; Task analysis; Training |
| Title | Heterogeneous Multi-Robot Cooperation With Asynchronous Multi-Agent Reinforcement Learning |
| URI | https://ieeexplore.ieee.org/document/10301527 https://www.proquest.com/docview/2892376562 |
| Volume | 9 |
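
The abstract names three building blocks that Qrainbow combines: value decomposition, a dueling network, and a double network. The following is a minimal sketch of how those pieces are commonly computed, not the authors' implementation. It assumes a plain additive (VDN-style) sum over agents in place of the multi-head attention mixer of Qatten, and it leaves out prioritized experience replay and macro-actions. All function and parameter names (`dueling_q`, `double_q_vdn_target`, `online_next`, `target_next`) are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def dueling_q(value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(a) = V + A(a) - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


def argmax(xs: Sequence[float]) -> int:
    """Index of the largest element (first one on ties)."""
    return max(range(len(xs)), key=lambda i: xs[i])


def double_q_vdn_target(
    reward: float,
    gamma: float,
    online_next: Sequence[Sequence[float]],
    target_next: Sequence[Sequence[float]],
    done: bool,
) -> float:
    """One-step team TD target under an additive value decomposition.

    online_next[i] and target_next[i] hold agent i's next-state Q-values
    from the online and target networks. Assumption: the joint Q-value is
    the plain sum of per-agent Q-values, which is a simplification of the
    attention-based mixing described in the abstract.
    """
    if done:
        return reward
    total = 0.0
    for q_online, q_target in zip(online_next, target_next):
        # Double Q-learning: pick the action with the online network,
        # then evaluate that action with the target network.
        best_action = argmax(q_online)
        total += q_target[best_action]
    return reward + gamma * total
```

Because the joint value is a sum in this sketch, the greedy joint action is simply each agent's own greedy action, which is why the per-agent `argmax` suffices here.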