Applications of Multi-Agent Deep Reinforcement Learning: Models and Algorithms
Recent advancements in deep reinforcement learning (DRL) have led to its application in multi-agent scenarios to solve complex real-world problems, such as network resource allocation and sharing, network routing, and traffic signal controls. Multi-agent DRL (MADRL) enables multiple agents to interact with each other and with their operating environment, and learn without the need for external critics (or teachers), thereby solving complex problems.
Saved in:
| Published in: | Applied Sciences Vol. 11; Issue 22; p. 10870 |
|---|---|
| Main Authors: | Abdikarim Mohamed Ibrahim, Kok-Lim Alvin Yau, Yung-Wey Chong, Celimuge Wu |
| Format: | Journal Article |
| Language: | English |
| Published: | Basel: MDPI AG, 17.11.2021 |
| Subjects: | |
| ISSN: | 2076-3417 |
| Online Access: | Full text |
| Abstract | Recent advancements in deep reinforcement learning (DRL) have led to its application in multi-agent scenarios to solve complex real-world problems, such as network resource allocation and sharing, network routing, and traffic signal controls. Multi-agent DRL (MADRL) enables multiple agents to interact with each other and with their operating environment, and learn without the need for external critics (or teachers), thereby solving complex problems. Significant performance enhancements brought about by the use of MADRL have been reported in multi-agent domains; for instance, it has been shown to provide higher quality of service (QoS) in network resource allocation and sharing. This paper presents a survey of MADRL models that have been proposed for various kinds of multi-agent domains, in a taxonomic approach that highlights various aspects of MADRL models and applications, including objectives, characteristics, challenges, applications, and performance measures. Furthermore, we present open issues and future directions of MADRL. |
|---|---|
| Author | Abdikarim Mohamed Ibrahim, Kok-Lim Alvin Yau, Yung-Wey Chong, Celimuge Wu |
| BackLink | https://cir.nii.ac.jp/crid/1870866216144577024 (View record in CiNii) |
| ContentType | Journal Article |
| Copyright | 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.3390/app112210870 |
| Discipline | Engineering Chemistry Sciences (General) Physics |
| EISSN | 2076-3417 |
| ISICitedReferencesCount | 18 |
| ISSN | 2076-3417 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 22 |
| Language | English |
| ORCID | 0000-0001-9861-266X 0000-0003-3110-2782 0000-0003-1750-7441 0000-0001-6853-5878 |
| OpenAccessLink | https://www.proquest.com/docview/2602008768 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-11-17 |
| PublicationDecade | 2020 |
| PublicationPlace | Basel |
| PublicationTitle | Applied Sciences |
| PublicationYear | 2021 |
| Publisher | MDPI AG |
| SecondaryResourceType | review_article |
| StartPage | 10870 |
| SubjectTerms | Algorithms applied reinforcement learning Biology (General) Chemistry Cooperative learning Decision making Deep learning deep Q-network Engineering (General). Civil engineering (General) Knowledge Markov analysis multi-agent deep reinforcement learning multi-agent reinforcement learning Physics QC1-999 QD1-999 QH301-705.5 reinforcement learning T TA1-2040 Taxonomy Technology Wireless networks |
| Title | Applications of Multi-Agent Deep Reinforcement Learning: Models and Algorithms |
| URI | https://cir.nii.ac.jp/crid/1870866216144577024 https://www.proquest.com/docview/2602008768 https://doaj.org/article/f3c7b1961f864e5a94614c615dbe262a |
| Volume | 11 |