Hindsight-aware deep reinforcement learning algorithm for multi-agent systems
Saved in:
| Published in: | International Journal of Machine Learning and Cybernetics, Vol. 13, No. 7, pp. 2045-2057 |
|---|---|
| Main authors: | Li, Chengjing; Wang, Li; Huang, Zirong |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg, 01.07.2022; Springer Nature B.V |
| Subjects: | |
| ISSN: | 1868-8071, 1868-808X |
| Online access: | Get full text |
| Tags: | |
| Abstract | Classic reinforcement learning algorithms generate experiences through the agent's constant trial and error, which leaves a large number of failure experiences in the replay buffer. As a result, the agents can only learn from these low-quality experiences; in multi-agent systems this problem is even more serious. MADDPG (Multi-Agent Deep Deterministic Policy Gradient) has achieved significant results on multi-agent problems by using a framework of centralized training with decentralized execution. Nevertheless, it does not address the excess of failure experiences in the replay buffer. In this paper, we propose HMADDPG (Hindsight Multi-Agent Deep Deterministic Policy Gradient) to mitigate the negative impact of failure experiences. HMADDPG adds a hindsight unit that lets the agents reflect on past episodes and produce pseudo experiences that tend toward success. These pseudo experiences are stored in the replay buffer, so the agents can learn from both kinds of experience. We evaluated our algorithm in a number of environments. The results show that it guides agents to learn better strategies and can be applied to multi-agent systems that are cooperative, competitive, or mixed cooperative and competitive. |
|---|---|
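The record does not spell out the paper's exact hindsight unit, but the abstract's core idea, relabeling failed episodes into "pseudo experiences that tend to succeed" and mixing them into the replay buffer, follows the general hindsight-relabeling pattern (as in hindsight experience replay, which several of the cited works build on). The sketch below is an illustration of that general pattern only, not the paper's implementation; all function and class names (`hindsight_relabel`, `ReplayBuffer`, `reward_fn`) are invented for the example.

```python
import random
from collections import deque

def hindsight_relabel(trajectory, reward_fn):
    """Turn a (possibly failed) goal-conditioned trajectory into a pseudo experience.

    Each transition is (state, action, reward, next_state, goal, achieved).
    The outcome actually achieved at the end of the episode replaces the
    original goal, and rewards are recomputed against it, so the episode
    now looks successful to the learner.
    """
    new_goal = trajectory[-1][5]  # what the agent actually reached
    relabeled = []
    for state, action, _, next_state, _, achieved in trajectory:
        reward = reward_fn(achieved, new_goal)
        relabeled.append((state, action, reward, next_state, new_goal, achieved))
    return relabeled

class ReplayBuffer:
    """Minimal buffer holding both real and relabeled (pseudo) experiences."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def extend(self, transitions):
        self.buffer.extend(transitions)

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))

# Sparse reward: 1 only when the achieved outcome matches the goal.
def reward_fn(achieved, goal):
    return 1.0 if achieved == goal else 0.0
```

For example, a two-step episode aimed at goal "A" that actually ends at "B" earns zero reward everywhere; after relabeling with the new goal "B", the final transition's reward becomes 1, and both the original and relabeled transitions are stored so the learner sees successes alongside failures.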
| Author | Li, Chengjing; Wang, Li; Huang, Zirong |
| Author details | Li, Chengjing (College of Data Science, Taiyuan University of Technology); Wang, Li (wangli@tyut.edu.cn, ORCID 0000-0002-7385-1426, College of Data Science, Taiyuan University of Technology); Huang, Zirong (College of Data Science, Taiyuan University of Technology) |
| CitedBy | 10.1016/j.eswa.2025.129617; 10.1145/3675167; 10.1007/s13042-023-01845-2 |
| Cites_doi | 10.1038/nature14236 10.1038/nature16961 10.1109/TPWRS.2018.2876612 10.2352/ISSN.2470-1173.2017.19.AVM-023 10.1016/j.neucom.2019.06.022 10.3390/s20205911 10.1007/s10994-019-05864-5 10.1016/B978-1-55860-335-6.50027-1 10.1109/SAUPEC/RobMech/PRASA48453.2020.9041058 10.1609/aaai.v32i1.11794 10.1371/journal.pone.0172395 10.1177/1729881419898342 |
| ContentType | Journal Article |
| Copyright | The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022 |
| DOI | 10.1007/s13042-022-01505-x |
| Discipline | Engineering Sciences (General) |
| EISSN | 1868-808X |
| EndPage | 2057 |
| Funding | National Natural Science Foundation of China, grant 61872260 (funder ID: http://dx.doi.org/10.13039/501100001809) |
| ISICitedReferencesCount | 3 |
| ISSN | 1868-8071 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 7 |
| Keywords | Multi-agent system; Hindsight; Experience replay; Artificial intelligence; Machine learning; Reinforcement learning |
| Language | English |
| ORCID | 0000-0002-7385-1426 |
| PageCount | 13 |
| PublicationDate | 2022-07-01 |
| PublicationPlace | Berlin/Heidelberg |
| PublicationTitle | International journal of machine learning and cybernetics |
| PublicationTitleAbbrev | Int. J. Mach. Learn. & Cyber |
| PublicationYear | 2022 |
| Publisher | Springer Berlin Heidelberg; Springer Nature B.V |
| References | Lin LJ (1992) Self-improving reactive agents based on reinforcement learning, planning and teaching. Mach Learn 8:293-321
Mnih V et al (2015) Human-level control through deep reinforcement learning. Nature 518(7540):529-533. https://doi.org/10.1038/nature14236
Silver D et al (2016) Mastering the game of Go with deep neural networks and tree search. Nature 529(7587):484-489. https://doi.org/10.1038/nature16961
Sallab AE et al (2017) Deep reinforcement learning framework for autonomous driving. Electron Imaging 2017(19):70-76. https://doi.org/10.2352/ISSN.2470-1173.2017.19.AVM-023
Luo F et al (2019) A distributed electricity trading system in active distribution networks based on multi-agent coalition and blockchain. IEEE Trans Power Syst 34:4097-4108. https://doi.org/10.1109/TPWRS.2018.2876612
Bai C et al (2019) Guided goal generation for hindsight multi-goal reinforcement learning. Neurocomputing 359:353-367. https://doi.org/10.1016/j.neucom.2019.06.022
Pesce E, Montana G (2020) Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication. Mach Learn 109(9-10):1727-1747. https://doi.org/10.1007/s10994-019-05864-5
Prianto E et al (2020) Path planning for multi-arm manipulators using deep reinforcement learning: soft actor-critic with hindsight experience replay. Sensors 20:5911. https://doi.org/10.3390/s20205911
Further cited works identified in this record only by DOI: 10.1016/B978-1-55860-335-6.50027-1; 10.1109/SAUPEC/RobMech/PRASA48453.2020.9041058; 10.1177/1729881419898342; 10.1371/journal.pone.0172395; 10.1609/aaai.v32i1.11794 |
| StartPage | 2045 |
| SubjectTerms | Algorithms; Artificial Intelligence; Buffers; Communication; Complex Systems; Computational Intelligence; Control; Curricula; Deep learning; Efficiency; Engineering; Failure; Machine learning; Mechatronics; Multiagent systems; Original Article; Pattern Recognition; Robotics; Success; Systems Biology; Teaching methods |
| Title | Hindsight-aware deep reinforcement learning algorithm for multi-agent systems |
| URI | https://link.springer.com/article/10.1007/s13042-022-01505-x https://www.proquest.com/docview/2919363250 |
| Volume | 13 |