Applications of Multi-Agent Deep Reinforcement Learning: Models and Algorithms

Recent advancements in deep reinforcement learning (DRL) have led to its application in multi-agent scenarios to solve complex real-world problems, such as network resource allocation and sharing, network routing, and traffic signal controls. Multi-agent DRL (MADRL) enables multiple agents to interact with each other and with their operating environment, and learn without the need for external critics (or teachers), thereby solving complex problems. Significant performance enhancements brought about by the use of MADRL have been reported in multi-agent domains; for instance, it has been shown to provide higher quality of service (QoS) in network resource allocation and sharing. This paper presents a survey of MADRL models that have been proposed for various kinds of multi-agent domains, in a taxonomic approach that highlights various aspects of MADRL models and applications, including objectives, characteristics, challenges, applications, and performance measures. Furthermore, we present open issues and future directions of MADRL.

Bibliographic details
Published in: Applied Sciences, Vol. 11, Issue 22, p. 10870
Main authors: Ibrahim, Abdikarim Mohamed; Yau, Kok-Lim Alvin; Chong, Yung-Wey; Wu, Celimuge
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 17 November 2021
ISSN: 2076-3417
Online access: Full text
Authors:
– Ibrahim, Abdikarim Mohamed (ORCID: 0000-0001-9861-266X)
– Yau, Kok-Lim Alvin (ORCID: 0000-0003-3110-2782)
– Chong, Yung-Wey (ORCID: 0000-0003-1750-7441)
– Wu, Celimuge (ORCID: 0000-0001-6853-5878)
Copyright: 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.3390/app112210870
Subject terms:
Algorithms
applied reinforcement learning
Biology (General)
Chemistry
Cooperative learning
Decision making
Deep learning
deep Q-network
Engineering (General). Civil engineering (General)
Knowledge
Markov analysis
multi-agent deep reinforcement learning
multi-agent reinforcement learning
Physics
QC1-999
QD1-999
QH301-705.5
reinforcement learning
T
TA1-2040
Taxonomy
Technology
Wireless networks
Online access:
https://cir.nii.ac.jp/crid/1870866216144577024
https://www.proquest.com/docview/2602008768
https://doaj.org/article/f3c7b1961f864e5a94614c615dbe262a