Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 484, pp. 46-58
Main Authors: Zhou, Zejian; Xu, Hao
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.05.2022
Subjects:
ISSN: 0925-2312, 1872-8286
Abstract This paper investigates the intelligent design of pursuit-evasion games with a large number of pursuers and evaders. Because of the vast number of agents, the notorious "curse of dimensionality" seriously challenges traditional multi-player pursuit-evasion designs, especially in harsh environments where limited communication resources constrain information exchange among players. To address this challenge, the emerging Mean Field Games (MFG) theory is used to solve for the optimal pursuit-evasion strategies based on a new form of probability density function (PDF) instead of detailed information from all the other agents. This reduces both the information exchange and the computational dimension of the optimal strategy derivation. Specifically, MFG is integrated into the pursuit-evasion game to generate a hierarchical structure in which the pursuers and the evaders form two separate mean field groups. To solve the mean field equations online, i.e., two coupled partial differential equations, the actor-critic reinforcement learning mechanism is adopted and extended to a novel actor-critic-mass-opponent (ACMO) approach. In ACMO, the actor neural network estimates the optimal control, the critic neural network approximates the optimal cost function, the mass neural network learns the PDF of the agent's own group, and the opponent neural network predicts the opponents' average states in the form of the PDF that causes the maximum cost for the agent's group. Lyapunov theory is used to provide the convergence analysis for all neural networks and the stability analysis for the closed-loop system. Finally, a series of numerical simulations demonstrates the effectiveness of the developed scheme.
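The abstract's point that an agent can act on a PDF of its group rather than on every other agent's state can be illustrated with a small sketch. This is not the paper's method: the paper learns the PDF with a mass neural network, whereas the sketch below swaps in a plain histogram estimate on a one-dimensional state. The function name `empirical_pdf`, the state range, the bin count, and the Gaussian test data are all assumptions made for illustration.

```python
import random


def empirical_pdf(states, lo, hi, bins):
    """Histogram estimate of a 1-D population density on [lo, hi).

    Hypothetical helper for illustration only; the paper uses a neural
    network, not a histogram, to represent the group PDF.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in states:
        # Clamp the bin index so out-of-range states land in the edge bins.
        k = min(bins - 1, max(0, int((x - lo) / width)))
        counts[k] += 1
    n = len(states)
    # Divide by n * width so the piecewise-constant density integrates to 1.
    return [c / (n * width) for c in counts]


random.seed(0)
# Two hypothetical groups of very different size (assumed Gaussian states).
small = [random.gauss(0.0, 1.0) for _ in range(100)]
large = [random.gauss(0.0, 1.0) for _ in range(100_000)]

pdf_small = empirical_pdf(small, -4.0, 4.0, 16)
pdf_large = empirical_pdf(large, -4.0, 4.0, 16)

# The summary an agent needs has 16 numbers for either group size.
print(len(pdf_small), len(pdf_large))
```

The length of the density summary is fixed by the number of bins, so it does not grow with the number of agents; this is the dimension reduction the abstract attributes to the mean field formulation, shown here only in a simplified form.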
Author Zhou, Zejian
Xu, Hao
Author_xml – sequence: 1
  givenname: Zejian
  surname: Zhou
  fullname: Zhou, Zejian
  organization: Department of Electrical & Biomedical Engineering, University of Nevada, Reno, NV, USA
– sequence: 2
  givenname: Hao
  surname: Xu
  fullname: Xu, Hao
  email: haoxu@unr.edu
  organization: Department of Electrical & Biomedical Engineering, University of Nevada, Reno, NV, USA
ContentType Journal Article
Copyright 2021 Elsevier B.V.
DOI 10.1016/j.neucom.2021.01.141
Discipline Computer Science
EISSN 1872-8286
EndPage 58
ExternalDocumentID 10_1016_j_neucom_2021_01_141
S0925231221015769
ISICitedReferencesCount 31
ISSN 0925-2312
IngestDate Sat Nov 29 07:16:18 EST 2025
Tue Nov 18 21:08:38 EST 2025
Fri Feb 23 02:41:14 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Mean field game theory
Pursuit-evasion game
Approximate dynamic programming
Optimal control
Reinforcement learning
Language English
LinkModel OpenURL
PageCount 13
PublicationCentury 2000
PublicationDate 2022-05-01
2022-05-00
PublicationDateYYYYMMDD 2022-05-01
PublicationDate_xml – month: 05
  year: 2022
  text: 2022-05-01
  day: 01
PublicationDecade 2020
PublicationTitle Neurocomputing (Amsterdam)
PublicationYear 2022
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Li, Zhang, Sun, Zhang (b0020) 2020; 418
J. Han, A. Jentzen, E. Weinan, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences of the United States of America 115 (34) (2018) 8505–8510, arXiv: 1707.02568. doi:10.1073/pnas.1718942115. URL:www.pnas.org/cgi/doi/10.1073/pnas.1718942115.
Wang, Liu, Wang, Wu, Lü (b0025) 2020; 413
H. Li, X. Liao, T. Huang, W. Zhu, Event-Triggering Sampling Based Leader-Following Consensus in Second-Order Multi-Agent Systems, IEEE Transactions on Automatic Control 60 (7) (2015) 1998–2003, conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2014.2365073.
Caines, Huang, Malhamé (b0220) 2017
V.G. Lopez, F.L. Lewis, Y. Wan, E.N. Sanchez, L. Fan, Solutions for Multiagent Pursuit-Evasion Games on Communication Graphs: Finite-Time Capture and Asymptotic Behaviors, IEEE Transactions on Automatic Control 65 (5) (2020) 1911–1923, conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2019.2926554.
L. Busoniu, R. Babuška, B. De Schutter, Multi-agent reinforcement learning: An overview, Studies in Computational Intelligence 310 (2010) 183–221, publisher: Springer, Berlin, Heidelberg ISBN: 9783642144349. doi:10.1007/978-3-642-14435-6_7.
Wang, Zhang, Leung, Guizani, Jiang (b0200) 2018; 25
Sun, Wang (b0010) 2020; 413
Liao, Han, Dong, Li, Ren (b0015) 2020; 415
F.L. Lewis, D. Vrabie, K.G. Vamvoudakis, Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers, IEEE Control Systems 32 (6) (2012) 76–105, publisher: IEEE. doi:10.1109/MCS.2012.2214134.
Oh, Park, Ahn (b0075) 2015; 53
Yao, Dou, Yue, Zhao, Zhang (b0005) 2020; 415
Bensoussan, Frehse, Yam (b0210) 2013
Z. Zhang, Y. Xiao, Z. Ma, M. Xiao, Z. Ding, X. Lei, G.K. Karagiannidis, P. Fan, 6G Wireless Networks: Vision, Requirements, Architecture, and Key Technologies, IEEE Vehicular Technology Magazine 14 (3) (2019) 28–41, conference Name: IEEE Vehicular Technology Magazine. doi:10.1109/MVT.2019.2921208.
Consensus Control of Time-Varying Multiagent Systems With Stochastic Communication Protocol, IEEE Transactions on Cybernetics 47 (8) (2017) 1830–1840, conference Name: IEEE Transactions on Cybernetics. doi:10.1109/TCYB.2017.2685425.
V. Turetsky, T. Shima, Target Evasion from a Missile Performing Multiple Switches in Guidance Law, Journal of Guidance, Control, and Dynamics 39 (10) (2016) 2364–2373, publisher: American Institute of Aeronautics and Astronautics _eprint: doi: 10.2514/1.G000461. doi:10.2514/1.G000461. URL:https://doi.org/10.2514/1.G000461.
E. Garcia, D.W. Casbeer, A.V. Moll, M. Pachter, Multiple Pursuer Multiple Evader Differential Games, IEEE Transactions on Automatic Control (2020) 1–1Conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2020.3003840.
Camci, Kayacan (b0035) 2016
Vrabie, Vamvoudakis, Lewis (b0155) 2012
J. Chen, B. Chen, Z. Zeng, Synchronization and Consensus in Networks of Linear Fractional-Order Multi-Agent Systems via Sampled-Data Control, IEEE Transactions on Neural Networks and Learning Systems 31 (8) (2020) 2955–2964, conference Name: IEEE Transactions on Neural Networks and Learning Systems. doi:10.1109/TNNLS.2019.2934648.
L. Panait, S. Luke, Cooperative multi-agent learning: The state of the art, Autonomous Agents and Multi-Agent Systems 11 (3) (2005) 387–434, publisher: Springer. doi:10.1007/s10458-005-2631-2.
H.-N. Dai, R.C.-W. Wong, H. Wang, Z. Zheng, A.V. Vasilakos, Big Data Analytics for Large-scale Wireless Networks: Challenges and Opportunities, ACM Computing Surveys 52 (5) (2019) 99:1–99:36. doi:10.1145/3337065. URL:https://doi.org/10.1145/3337065.
K.G. Vamvoudakis, F.L. Lewis, Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations, Automatica 47 (8) (2011) 1556–1569, publisher: Pergamon. doi:10.1016/J.AUTOMATICA.2011.03.005. URL:https://www.sciencedirect.com/science/article/pii/S0005109811001774.
J.K. Gupta, M. Egorov, M. Kochenderfer, Cooperative Multi-agent Control Using Deep Reinforcement Learning, in: G. Sukthankar, J.A. Rodriguez-Aguilar (Eds.), Autonomous Agents and Multiagent Systems, Lecture Notes in Computer Science, Springer International Publishing, Cham, 2017, pp. 66–83. doi:10.1007/978-3-319-71682-4_5.
Lasry, Lions (b0140) 2007; 2
Gunasekaran, Zhai, Yu (b0030) 2020; 413
M. Katsev, A. Yershova, B. Tovar, R. Ghrist, S.M. LaValle, Mapping and Pursuit-Evasion Strategies For a Simple Wall-Following Robot, IEEE Transactions on Robotics 27 (1) (2011) 113–128, conference Name: IEEE Transactions on Robotics. doi:10.1109/TRO.2010.2095570.
M. Liu, Y. Wan, F.L. Lewis, V.G. Lopez, Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning, IEEE Transactions on Neural Networks and Learning Systems 31 (12) (2020) 5522–5533, conference Name: IEEE Transactions on Neural Networks and Learning Systems. doi:10.1109/TNNLS.2020.2969215.
L. Zou, Z. Wang, H. Gao, F.E. Alsaadi, Finite-Horizon
Lewis, Vrabie (b0215) 2009; 9
Wang, Dong, Sun (b0130) 2020; 412
L. Búrdalo, A. Terrasa, V. Julián, A. García-Fornes, The Information Flow Problem in multi-agent systems, Engineering Applications of Artificial Intelligence 70 (2018) 130–141, publisher: Elsevier Ltd. doi:10.1016/j.engappai.2018.01.011.
Zhou, Xu, Game, Strategy (b0145) 2020; 2020
M. Abu-Khalaf, F.L. Lewis, Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach, Automatica doi:10.1016/j.automatica.2004.11.034.
Guéant, Lasry, Lions (b0135) 2011
R. Vidal, S. Rashid, C. Sharp, O. Shakernia, J. Kim, S. Sastry, Pursuit-evasion games with unmanned ground and aerial vehicles, in: Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164), Vol. 3, 2001, pp. 2948–2955 vol 3, iSSN: 1050–4729. doi:10.1109/ROBOT.2001.933069.
K. Vamvoudakis, D. Vrabie, F. Lewis, Online policy iteration based algorithms to solve the continuous- time infinite horizon optimal control problem, in: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings, 2009, pp. 36–41. doi:10.1109/ADPRL.2009.4927523.
M. Pechoucek, V. Marik, O. Stepankova, Towards Reducing Communication Traffic In Multi-Agent Systems, Journal of Applied Systems Science: Special Issue 2 (1) (2001) 211–245. URL:http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.8980.
F.L. Lewis, D. Vrabie, V.L. Syrmos, Optimal Control, 3rd Edition., John Wiley & Sons, 2012, oCLC: 940552625.
W. Lin, Z. Qu, M.A. Simaan, Nash strategies for pursuit-evasion differential games involving limited observations, IEEE Transactions on Aerospace and Electronic Systems 51 (2) (2015) 1347–1356, conference Name: IEEE Transactions on Aerospace and Electronic Systems. doi:10.1109/TAES.2014.130569.
Zhou, Xu (b0170) 2019
Liu, Li, Shan, Yu, Wu, Chen (b0065) 2020; 404
Gomes, Saúde (b0205) 2014; 4
Lv, Ren, Na (b0125) 2019; 356
Lv, Ren, Na (b0160) 2018; 283
M. Agiwal, A. Roy, N. Saxena, Next Generation 5G Wireless Networks: A Comprehensive Survey, IEEE Communications Surveys Tutorials 18 (3) (2016) 1617–1655, conference Name: IEEE Communications Surveys Tutorials. doi:10.1109/COMST.2016.2532458.
Camci (10.1016/j.neucom.2021.01.141_b0035) 2016
10.1016/j.neucom.2021.01.141_b0175
10.1016/j.neucom.2021.01.141_b0230
10.1016/j.neucom.2021.01.141_b0055
10.1016/j.neucom.2021.01.141_b0110
10.1016/j.neucom.2021.01.141_b0115
Lasry (10.1016/j.neucom.2021.01.141_b0140) 2007; 2
10.1016/j.neucom.2021.01.141_b0090
Bensoussan (10.1016/j.neucom.2021.01.141_b0210) 2013
10.1016/j.neucom.2021.01.141_b0190
10.1016/j.neucom.2021.01.141_b0070
Sun (10.1016/j.neucom.2021.01.141_b0010) 2020; 413
Gunasekaran (10.1016/j.neucom.2021.01.141_b0030) 2020; 413
10.1016/j.neucom.2021.01.141_b0050
10.1016/j.neucom.2021.01.141_b0095
10.1016/j.neucom.2021.01.141_b0150
Gomes (10.1016/j.neucom.2021.01.141_b0205) 2014; 4
Wang (10.1016/j.neucom.2021.01.141_b0025) 2020; 413
Zhou (10.1016/j.neucom.2021.01.141_b0170) 2019
10.1016/j.neucom.2021.01.141_b0195
Oh (10.1016/j.neucom.2021.01.141_b0075) 2015; 53
Zhou (10.1016/j.neucom.2021.01.141_b0145) 2020; 2020
Caines (10.1016/j.neucom.2021.01.141_b0220) 2017
Lewis (10.1016/j.neucom.2021.01.141_b0215) 2009; 9
Guéant (10.1016/j.neucom.2021.01.141_b0135) 2011
10.1016/j.neucom.2021.01.141_b0185
10.1016/j.neucom.2021.01.141_b0120
10.1016/j.neucom.2021.01.141_b0165
10.1016/j.neucom.2021.01.141_b0045
10.1016/j.neucom.2021.01.141_b0100
Yao (10.1016/j.neucom.2021.01.141_b0005) 2020; 415
Li (10.1016/j.neucom.2021.01.141_b0020) 2020; 418
Vrabie (10.1016/j.neucom.2021.01.141_b0155) 2012
10.1016/j.neucom.2021.01.141_b0225
10.1016/j.neucom.2021.01.141_b0080
10.1016/j.neucom.2021.01.141_b0180
10.1016/j.neucom.2021.01.141_b0060
Liao (10.1016/j.neucom.2021.01.141_b0015) 2020; 415
10.1016/j.neucom.2021.01.141_b0040
10.1016/j.neucom.2021.01.141_b0085
Lv (10.1016/j.neucom.2021.01.141_b0160) 2018; 283
Lv (10.1016/j.neucom.2021.01.141_b0125) 2019; 356
Wang (10.1016/j.neucom.2021.01.141_b0200) 2018; 25
10.1016/j.neucom.2021.01.141_b0105
Wang (10.1016/j.neucom.2021.01.141_b0130) 2020; 412
Liu (10.1016/j.neucom.2021.01.141_b0065) 2020; 404
References_xml – volume: 415
  start-page: 234
  year: 2020
  end-page: 246
  ident: b0015
  article-title: Finite-time formation-containment tracking for second-order multi-agent systems with a virtual leader of fully unknown input
  publication-title: Neurocomputing
– start-page: 205
  year: 2011
  end-page: 266
  ident: b0135
  article-title: Mean Field Games and Applications, in: A. Cousin, S. Crépey, O. Guéant, D. Hobson, M. Jeanblanc, J.-M. Lasry, J.-P. Laurent, P.-L. Lions, P. Tankov (Eds.), Paris-Princeton Lectures on Mathematical Finance 2010
– volume: 418
  start-page: 191
  year: 2020
  end-page: 199
  ident: b0020
  article-title: Fully distributed event-triggered consensus protocols for multi-agent systems with physically interconnected network
  publication-title: Neurocomputing
– reference: L. Panait, S. Luke, Cooperative multi-agent learning: The state of the art, Autonomous Agents and Multi-Agent Systems 11 (3) (2005) 387–434, publisher: Springer. doi:10.1007/s10458-005-2631-2.
– reference: M. Liu, Y. Wan, F.L. Lewis, V.G. Lopez, Adaptive Optimal Control for Stochastic Multiplayer Differential Games Using On-Policy and Off-Policy Reinforcement Learning, IEEE Transactions on Neural Networks and Learning Systems 31 (12) (2020) 5522–5533, conference Name: IEEE Transactions on Neural Networks and Learning Systems. doi:10.1109/TNNLS.2020.2969215.
– reference: H. Li, X. Liao, T. Huang, W. Zhu, Event-Triggering Sampling Based Leader-Following Consensus in Second-Order Multi-Agent Systems, IEEE Transactions on Automatic Control 60 (7) (2015) 1998–2003, conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2014.2365073.
– reference: J. Chen, B. Chen, Z. Zeng, Synchronization and Consensus in Networks of Linear Fractional-Order Multi-Agent Systems via Sampled-Data Control, IEEE Transactions on Neural Networks and Learning Systems 31 (8) (2020) 2955–2964, conference Name: IEEE Transactions on Neural Networks and Learning Systems. doi:10.1109/TNNLS.2019.2934648.
– reference: R. Vidal, S. Rashid, C. Sharp, O. Shakernia, J. Kim, S. Sastry, Pursuit-evasion games with unmanned ground and aerial vehicles, in: Proceedings 2001 ICRA. IEEE International Conference on Robotics and Automation (Cat. No.01CH37164), Vol. 3, 2001, pp. 2948–2955 vol 3, iSSN: 1050–4729. doi:10.1109/ROBOT.2001.933069.
– volume: 283
  start-page: 87
  year: 2018
  end-page: 97
  ident: b0160
  article-title: Online optimal solutions for multi-player nonzero-sum game with completely unknown dynamics
  publication-title: Neurocomputing
– reference: L. Zou, Z. Wang, H. Gao, F.E. Alsaadi, Finite-Horizon
– reference: M. Katsev, A. Yershova, B. Tovar, R. Ghrist, S.M. LaValle, Mapping and Pursuit-Evasion Strategies For a Simple Wall-Following Robot, IEEE Transactions on Robotics 27 (1) (2011) 113–128, conference Name: IEEE Transactions on Robotics. doi:10.1109/TRO.2010.2095570.
– reference: J.K. Gupta, M. Egorov, M. Kochenderfer, Cooperative Multi-agent Control Using Deep Reinforcement Learning, in: G. Sukthankar, J.A. Rodriguez-Aguilar (Eds.), Autonomous Agents and Multiagent Systems, Lecture Notes in Computer Science, Springer International Publishing, Cham, 2017, pp. 66–83. doi:10.1007/978-3-319-71682-4_5.
– volume: 53
  start-page: 424
  year: 2015
  end-page: 440
  ident: b0075
  article-title: A survey of multi-agent formation control
  publication-title: Automatica
– reference: L. Busoniu, R. Babuška, B. De Schutter, Multi-agent reinforcement learning: An overview, Studies in Computational Intelligence 310 (2010) 183–221, publisher: Springer, Berlin, Heidelberg ISBN: 9783642144349. doi:10.1007/978-3-642-14435-6_7.
– reference: M. Abu-Khalaf, F.L. Lewis, Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach, Automatica doi:10.1016/j.automatica.2004.11.034.
– volume: 9
  start-page: 32
  year: 2009
  end-page: 50
  ident: b0215
  article-title: Reinforcement learning and adaptive dynamic programming for feedback control
  publication-title: IEEE Circuits and Systems Magazine
– start-page: 1
  year: 2017
  end-page: 28
  ident: b0220
  article-title: Mean Field Games, in: T. Basar, G. Zaccour (Eds.), Handbook of Dynamic Game Theory
– start-page: 618
  year: 2016
  end-page: 625
  ident: b0035
  article-title: Game of drones: UAV pursuit-evasion game with type-2 fuzzy logic controllers tuned by reinforcement learning
  publication-title: 2016 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
– reference: L. Búrdalo, A. Terrasa, V. Julián, A. García-Fornes, The Information Flow Problem in multi-agent systems, Engineering Applications of Artificial Intelligence 70 (2018) 130–141, publisher: Elsevier Ltd. doi:10.1016/j.engappai.2018.01.011.
– volume: 413
  start-page: 14
  year: 2020
  end-page: 22
  ident: b0010
  article-title: Event-triggered consensus control of high-order multi-agent systems with arbitrary switching topologies via model partitioning approach
  publication-title: Neurocomputing
– reference: Consensus Control of Time-Varying Multiagent Systems With Stochastic Communication Protocol, IEEE Transactions on Cybernetics 47 (8) (2017) 1830–1840, conference Name: IEEE Transactions on Cybernetics. doi:10.1109/TCYB.2017.2685425.
– reference: K.G. Vamvoudakis, F.L. Lewis, Multi-player non-zero-sum games: Online adaptive learning solution of coupled Hamilton-Jacobi equations, Automatica 47 (8) (2011) 1556–1569, publisher: Pergamon. doi:10.1016/J.AUTOMATICA.2011.03.005. URL:https://www.sciencedirect.com/science/article/pii/S0005109811001774.
– year: 2012
  ident: b0155
  article-title: Optimal Adaptive Control and Differential Games by Reinforcement Learning Principles
– reference: F.L. Lewis, D. Vrabie, V.L. Syrmos, Optimal Control, 3rd Edition., John Wiley & Sons, 2012, oCLC: 940552625.
– reference: J. Han, A. Jentzen, E. Weinan, Solving high-dimensional partial differential equations using deep learning, Proceedings of the National Academy of Sciences of the United States of America 115 (34) (2018) 8505–8510, arXiv: 1707.02568. doi:10.1073/pnas.1718942115. URL:www.pnas.org/cgi/doi/10.1073/pnas.1718942115.
– volume: 2020
  start-page: 5382
  year: 2020
  end-page: 5387
  ident: b0145
  article-title: Mean Field Game and Decentralized Intelligent Adaptive Pursuit Evasion Strategy for Massive Multi-Agent System under Uncertain Environment, in: 2020 American Control Conference (ACC), IEEE, Denver, CO, USA, 2020
  publication-title: IEEE, Denver, CO, USA
– volume: 404
  start-page: 137
  year: 2020
  end-page: 144
  ident: b0065
  article-title: Online optimal consensus control of unknown linear multi-agent systems via time-based adaptive dynamic programming
  publication-title: Neurocomputing
– volume: 356
  start-page: 8255
  year: 2019
  end-page: 8277
  ident: b0125
  article-title: Adaptive optimal tracking controls of unknown multi-input systems based on nonzero-sum game theory
  publication-title: Journal of the Franklin Institute
– reference: F.L. Lewis, D. Vrabie, K.G. Vamvoudakis, Reinforcement learning and feedback control: Using natural decision methods to design optimal adaptive controllers, IEEE Control Systems 32 (6) (2012) 76–105, publisher: IEEE. doi:10.1109/MCS.2012.2214134.
– volume: 4
  start-page: 110
  year: 2014
  end-page: 154
  ident: b0205
  article-title: Mean Field Games Models–A Brief Survey
  publication-title: Dynamic Games and Applications
– reference: H.-N. Dai, R.C.-W. Wong, H. Wang, Z. Zheng, A.V. Vasilakos, Big Data Analytics for Large-scale Wireless Networks: Challenges and Opportunities, ACM Computing Surveys 52 (5) (2019) 99:1–99:36. doi:10.1145/3337065. URL:https://doi.org/10.1145/3337065.
– reference: M. Pechoucek, V. Marik, O. Stepankova, Towards Reducing Communication Traffic In Multi-Agent Systems, Journal of Applied Systems Science: Special Issue 2 (1) (2001) 211–245. URL:http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.58.8980.
– reference: K. Vamvoudakis, D. Vrabie, F. Lewis, Online policy iteration based algorithms to solve the continuous- time infinite horizon optimal control problem, in: 2009 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, ADPRL 2009 - Proceedings, 2009, pp. 36–41. doi:10.1109/ADPRL.2009.4927523.
– volume: 413
  start-page: 339
  year: 2020
  end-page: 347
  ident: b0025
  article-title: Leader-following consensus of multi-agent systems under antagonistic networks
  publication-title: Neurocomputing
– reference: V. Turetsky, T. Shima, Target Evasion from a Missile Performing Multiple Switches in Guidance Law, Journal of Guidance, Control, and Dynamics 39 (10) (2016) 2364–2373, publisher: American Institute of Aeronautics and Astronautics _eprint: doi: 10.2514/1.G000461. doi:10.2514/1.G000461. URL:https://doi.org/10.2514/1.G000461.
– reference: E. Garcia, D.W. Casbeer, A.V. Moll, M. Pachter, Multiple Pursuer Multiple Evader Differential Games, IEEE Transactions on Automatic Control (2020) 1–1Conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2020.3003840.
– volume: 25
  start-page: 32
  year: 2018
  end-page: 38
  ident: b0200
  article-title: D2D Big Data: Content Deliveries over Wireless Device-to-Device Sharing in Large-Scale Mobile Networks
  publication-title: IEEE Wireless Communications
– start-page: 1231
  year: 2019
  end-page: 1236
  ident: b0170
  article-title: Decentralized Adaptive Optimal Tracking Control for Massive Multi-agent Systems: An Actor-Critic-Mass Algorithm
  publication-title: in: 2019 IEEE 58th Conference on Decision and Control (CDC)
– volume: 415
  start-page: 157
  year: 2020
  end-page: 164
  ident: b0005
  article-title: Event-triggered adaptive consensus tracking control for nonlinear switching multi-agent systems
  publication-title: Neurocomputing
– reference: M. Agiwal, A. Roy, N. Saxena, Next Generation 5G Wireless Networks: A Comprehensive Survey, IEEE Communications Surveys Tutorials 18 (3) (2016) 1617–1655, conference Name: IEEE Communications Surveys Tutorials. doi:10.1109/COMST.2016.2532458.
– reference: W. Lin, Z. Qu, M.A. Simaan, Nash strategies for pursuit-evasion differential games involving limited observations, IEEE Transactions on Aerospace and Electronic Systems 51 (2) (2015) 1347–1356, conference Name: IEEE Transactions on Aerospace and Electronic Systems. doi:10.1109/TAES.2014.130569.
– reference: Z. Zhang, Y. Xiao, Z. Ma, M. Xiao, Z. Ding, X. Lei, G.K. Karagiannidis, P. Fan, 6G Wireless Networks: Vision, Requirements, Architecture, and Key Technologies, IEEE Vehicular Technology Magazine 14 (3) (2019) 28–41, conference Name: IEEE Vehicular Technology Magazine. doi:10.1109/MVT.2019.2921208.
– year: 2013
  ident: b0210
  article-title: Mean Field Games and Mean Field Type Control Theory, SpringerBriefs in Mathematics, Springer-Verlag
  publication-title: New York
– volume: 412
  start-page: 101
  year: 2020
  end-page: 114
  ident: b0130
  article-title: Cooperative control for multi-player pursuit-evasion games with reinforcement learning
  publication-title: Neurocomputing
– reference: V.G. Lopez, F.L. Lewis, Y. Wan, E.N. Sanchez, L. Fan, Solutions for Multiagent Pursuit-Evasion Games on Communication Graphs: Finite-Time Capture and Asymptotic Behaviors, IEEE Transactions on Automatic Control 65 (5) (2020) 1911–1923, conference Name: IEEE Transactions on Automatic Control. doi:10.1109/TAC.2019.2926554.
– volume: 2
  start-page: 229
  year: 2007
  end-page: 260
  ident: b0140
  article-title: Mean field games
  publication-title: Japanese Journal of Mathematics
– volume: 413
  start-page: 499
  year: 2020
  end-page: 511
  ident: b0030
  article-title: Sampled-data synchronization of delayed multi-agent networks and its application to coupled circuit
  publication-title: Neurocomputing
Start Page: 46
Subject Terms: Approximate dynamic programming; Mean field game theory; Optimal control; Pursuit-evasion game; Reinforcement learning
Title: Decentralized optimal large scale multi-player pursuit-evasion strategies: A mean field game approach with reinforcement learning
URI: https://dx.doi.org/10.1016/j.neucom.2021.01.141
Volume: 484