A Multi-Agent Centralized Strategy Gradient Reinforcement Learning Algorithm Based on State Transition
Saved in:
| Published in: | Algorithms, Volume 17, Issue 12, p. 579 |
|---|---|
| Main authors: | Sheng, Lei; Chen, Honghui; Chen, Xiliang |
| Format: | Journal Article |
| Language: | English |
| Published: | Basel: MDPI AG, 01.12.2024 |
| ISSN: | 1999-4893 |
| Online access: | Get full text |
| Abstract | Deterministic policy algorithms are widely used in Multi-Agent Deep Reinforcement Learning (MADRL) for collaborative tasks, yet achieving stable, high-performance cooperative behavior with them remains a significant challenge. To balance exploration and exploitation for multi-agent ant robots operating in a partially observable continuous action space, this study introduces a multi-agent centralized policy gradient algorithm grounded in a local state transition mechanism. The algorithm learns local state and local state-action representations from local observations and action values, thereby establishing a “local state transition” mechanism autonomously. Used as the input of the actor network, the automatically extracted local observation representation reduces the input state dimension, strengthens the local state features closely tied to the local state transition, and encourages each agent to exploit the features that determine its next observed state. To mitigate non-stationarity and credit assignment issues in multi-agent environments, a centralized critic network evaluates the current joint policy. The proposed algorithm, NST-FACMAC, is evaluated against other multi-agent deterministic policy algorithms in a continuous control simulation environment with multi-agent ant robots. The experimental results show faster convergence and higher average rewards in cooperative multi-agent ant simulation environments. Notably, in the four simulated environments Ant-v2 (2 × 4), Ant-v2 (2 × 4d), Ant-v2 (4 × 2), and Manyant (2 × 3), the algorithm improves performance by approximately 1.9%, 4.8%, 11.9%, and 36.1%, respectively, over the best baseline algorithm. These findings underscore the algorithm’s effectiveness in stabilizing multi-agent ant robot control in dynamic environments. |
|---|---|
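The abstract describes three moving parts: a learned encoder that compresses each agent's local observation into a representation shaped by a “local state transition” objective, a decentralized actor that acts on that representation, and a centralized critic that scores the joint action during training. The sketch below illustrates that structure only; it is not the authors' code, and every module name, layer size, and the exact form of the auxiliary transition loss are assumptions made for illustration.

```python
# Illustrative sketch of the architecture the abstract describes (hypothetical
# names and sizes throughout; not the published NST-FACMAC implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class LocalEncoder(nn.Module):
    """Compresses one agent's local observation into a low-dimensional
    local state representation, reducing the actor's input dimension."""
    def __init__(self, obs_dim: int, repr_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, repr_dim)
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)


class TransitionHead(nn.Module):
    """Auxiliary head: predicts the next local representation from the
    current representation and action (the "local state transition")."""
    def __init__(self, repr_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(repr_dim + act_dim, 64), nn.ReLU(), nn.Linear(64, repr_dim)
        )

    def forward(self, z: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, act], dim=-1))


class Actor(nn.Module):
    """Deterministic policy over a continuous action space; its input is the
    learned representation rather than the raw local observation."""
    def __init__(self, repr_dim: int, act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(repr_dim, 64), nn.ReLU(), nn.Linear(64, act_dim), nn.Tanh()
        )

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.net(z)


class CentralizedCritic(nn.Module):
    """Scores the joint observation-action pair during training: the standard
    centralized-training remedy for non-stationarity and credit assignment."""
    def __init__(self, joint_obs_dim: int, joint_act_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(joint_obs_dim + joint_act_dim, 128), nn.ReLU(), nn.Linear(128, 1)
        )

    def forward(self, joint_obs: torch.Tensor, joint_act: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([joint_obs, joint_act], dim=-1))


def transition_loss(encoder: LocalEncoder, head: TransitionHead,
                    obs: torch.Tensor, act: torch.Tensor,
                    next_obs: torch.Tensor) -> torch.Tensor:
    """One plausible auxiliary objective: make (z_t, a_t) predictive of
    z_{t+1}, so the encoder keeps the features that drive the next state."""
    z = encoder(obs)
    z_next = encoder(next_obs).detach()  # stop-gradient on the prediction target
    return F.mse_loss(head(z, act), z_next)
```

Training the encoder so that the pair (z_t, a_t) predicts z_{t+1} is one plausible reading of the abstract's claim that the representation "enhances the local state features closely related to the local state transition"; the paper itself may define the representation loss differently.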
| Audience | Academic |
| Author | Sheng, Lei; Chen, Honghui; Chen, Xiliang |
| ContentType | Journal Article |
| Copyright | © 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.3390/a17120579 |
| Discipline | Computer Science |
| EISSN | 1999-4893 |
| ISSN | 1999-4893 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| Language | English |
| PublicationDate | 2024-12-01 |
| PublicationPlace | Basel |
| PublicationTitle | Algorithms |
| PublicationYear | 2024 |
| Publisher | MDPI AG |
| StartPage | 579 |
| SubjectTerms | Actors; Actresses; Algorithms; automatic representation; Automation; Collaborative learning; Control simulation; Control stability; Cooperation; Cooperative control; Critics; Data mining; Deep learning; deterministic strategy gradient algorithm; Efficiency; Exploitation; exploration; Learning strategies; Machine learning; multi-agent reinforcement learning; Multiagent systems; Network reliability; Representations; Robot control; Robots; state transition |
| Title | A Multi-Agent Centralized Strategy Gradient Reinforcement Learning Algorithm Based on State Transition |
| URI | https://www.proquest.com/docview/3149503562 https://doaj.org/article/972010c0e91d40d3be242eefb050f166 |
| Volume | 17 |