Hindsight-aware deep reinforcement learning algorithm for multi-agent systems

Detailed bibliography
Published in: International journal of machine learning and cybernetics, Volume 13, Issue 7, pp. 2045-2057
Main authors: Li, Chengjing; Wang, Li; Huang, Zirong
Format: Journal Article
Language: English
Published: Berlin/Heidelberg: Springer Berlin Heidelberg, 01.07.2022 (Springer Nature B.V.)
ISSN: 1868-8071, 1868-808X
Abstract Classic reinforcement learning algorithms generate experiences through the agent's constant trial and error, which leaves a large number of failure experiences stored in the replay buffer. As a result, the agents can only learn from these low-quality experiences. In multi-agent systems this problem is even more serious. MADDPG (Multi-Agent Deep Deterministic Policy Gradient) has achieved significant results on multi-agent problems by using a framework of centralized training with decentralized execution. Nevertheless, the problem of too many failure experiences in the replay buffer remains unresolved. In this paper, we propose HMADDPG (Hindsight Multi-Agent Deep Deterministic Policy Gradient) to mitigate the negative impact of failure experiences. HMADDPG has a hindsight unit that allows the agents to reflect on past episodes and produce pseudo experiences that tend to succeed. These pseudo experiences are stored in the replay buffer, so the agents can learn from both kinds of experience. We have evaluated the algorithm on a number of environments. The results show that it can guide agents to learn better strategies and can be applied in multi-agent systems that are cooperative, competitive, or mixed cooperative-competitive.
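The mechanism the abstract describes is hindsight relabeling: failed trajectories are re-stored as pseudo experiences whose goals are replaced by what the agents actually achieved, so they read as successes. Below is a minimal Python sketch of that idea for a shared multi-agent replay buffer, assuming a goal-conditioned task with sparse rewards; the names (MultiAgentReplayBuffer, store_with_hindsight), the final-state relabeling strategy, and the reward convention are illustrative assumptions, not the paper's exact HMADDPG hindsight unit.

```python
import random
from collections import deque

import numpy as np


class MultiAgentReplayBuffer:
    """Shared buffer of joint transitions (obs, actions, rewards, goals, ...) for all agents."""

    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


def sparse_reward(achieved, goal, tol=0.05):
    # Sparse success signal: 0 when the achieved state is close enough to the goal, else -1.
    return 0.0 if np.linalg.norm(np.asarray(achieved) - np.asarray(goal)) < tol else -1.0


def store_with_hindsight(episode, buffer, n_agents):
    """Store each real (often failed) transition plus a relabeled pseudo transition.

    The pseudo transition pretends that the state each agent actually reached at
    the end of the episode was the intended goal, so the recomputed sparse reward
    marks the trajectory as a success.
    """
    final_achieved = [episode[-1]["achieved"][i] for i in range(n_agents)]
    for t in episode:
        buffer.add(t)  # real experience, typically a failure under sparse rewards
        pseudo = dict(t)  # shallow copy; goal/reward fields are replaced below
        pseudo["goals"] = final_achieved
        pseudo["rewards"] = [
            sparse_reward(t["achieved"][i], final_achieved[i]) for i in range(n_agents)
        ]
        buffer.add(pseudo)  # hindsight pseudo experience that tends to succeed
```

After each episode one would call store_with_hindsight(episode, buffer, n_agents) and then train the centralized MADDPG-style critics on buffer.sample(batch_size), which mixes real failures with relabeled successes.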
Author Li, Chengjing
Wang, Li
Huang, Zirong
Author_xml – sequence: 1
  givenname: Chengjing
  surname: Li
  fullname: Li, Chengjing
  organization: College of Data Science, Taiyuan University of Technology
– sequence: 2
  givenname: Li
  orcidid: 0000-0002-7385-1426
  surname: Wang
  fullname: Wang, Li
  email: wangli@tyut.edu.cn
  organization: College of Data Science, Taiyuan University of Technology
– sequence: 3
  givenname: Zirong
  surname: Huang
  fullname: Huang, Zirong
  organization: College of Data Science, Taiyuan University of Technology
CitedBy_id crossref_primary_10_1016_j_eswa_2025_129617
crossref_primary_10_1145_3675167
crossref_primary_10_1007_s13042_023_01845_2
Cites_doi 10.1038/nature14236
10.1038/nature16961
10.1109/TPWRS.2018.2876612
10.2352/ISSN.2470-1173.2017.19.AVM-023
10.1016/j.neucom.2019.06.022
10.3390/s20205911
10.1007/s10994-019-05864-5
10.1016/B978-1-55860-335-6.50027-1
10.1109/SAUPEC/RobMech/PRASA48453.2020.9041058
10.1609/aaai.v32i1.11794
10.1371/journal.pone.0172395
10.1177/1729881419898342
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022
DOI 10.1007/s13042-022-01505-x
Discipline Engineering
Sciences (General)
EISSN 1868-808X
EndPage 2057
ExternalDocumentID 10_1007_s13042_022_01505_x
GrantInformation_xml – fundername: National Natural Science Foundation of China
  grantid: 61872260
  funderid: http://dx.doi.org/10.13039/501100001809
ISICitedReferencesCount 3
ISSN 1868-8071
IsPeerReviewed true
IsScholarly true
Issue 7
Keywords Multi-agent system
Hindsight
Experience replay
Artificial intelligence
Machine learning
Reinforcement learning
Language English
ORCID 0000-0002-7385-1426
PQID 2919363250
PQPubID 2043904
PageCount 13
PublicationCentury 2000
PublicationDate 2022-07-01
PublicationDecade 2020
PublicationPlace Berlin/Heidelberg
PublicationPlace_xml – name: Berlin/Heidelberg
– name: Heidelberg
PublicationTitle International journal of machine learning and cybernetics
PublicationTitleAbbrev Int. J. Mach. Learn. & Cyber
PublicationYear 2022
Publisher Springer Berlin Heidelberg
Springer Nature B.V
SourceID proquest
crossref
springer
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 2045
SubjectTerms Algorithms
Artificial Intelligence
Buffers
Communication
Complex Systems
Computational Intelligence
Control
Curricula
Deep learning
Efficiency
Engineering
Failure
Machine learning
Mechatronics
Multiagent systems
Original Article
Pattern Recognition
Robotics
Success
Systems Biology
Teaching methods
Title Hindsight-aware deep reinforcement learning algorithm for multi-agent systems
URI https://link.springer.com/article/10.1007/s13042-022-01505-x
https://www.proquest.com/docview/2919363250
Volume 13