Deep deterministic policy gradient algorithm for crowd-evacuation path planning

Published in: Computers & Industrial Engineering, Vol. 161, p. 107621
Main authors: Li, Xinjin; Liu, Hong; Li, Junqing; Li, Yan
Medium: Journal Article
Language: English
Published: Elsevier Ltd, 01.11.2021
Subject:
ISSN: 0360-8352, 1879-0550
Online access: Get full text
Abstract Highlights:
• We propose the E-MADDPG algorithm, which achieves higher learning efficiency than MADDPG.
• We extract motion trajectories from pedestrian video to reduce the state space.
• We propose a hierarchical crowd-evacuation path-planning method based on deep reinforcement learning (DRL).
In existing evacuation methods, the large number of pedestrians and the complexity of the environment reduce evacuation efficiency. We therefore propose a hierarchical evacuation method based on multi-agent deep reinforcement learning (MADRL) to address this problem. First, a two-level evacuation mechanism guides the evacuation: the crowd is divided into leaders and followers. Second, at the upper level, leaders perform path planning to guide the evacuation. To obtain the best evacuation path, we propose the efficient multi-agent deep deterministic policy gradient (E-MADDPG) algorithm for crowd-evacuation path planning. E-MADDPG uses learning curves to improve on the fixed experience pool of the MADDPG algorithm and adopts a high-priority experience replay strategy to improve sampling; these changes increase the learning efficiency of the algorithm. Meanwhile, we extract pedestrian motion trajectories from real motion videos to reduce the state space of the algorithm. Third, at the bottom level, followers use the reciprocal velocity obstacle (RVO) algorithm to avoid collisions while following the leaders. Finally, experimental results show that E-MADDPG improves path-planning efficiency and that the proposed method improves the efficiency of crowd evacuation.
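The two replay-buffer changes the abstract attributes to E-MADDPG (an experience pool whose size is adjusted from the learning curve rather than fixed, and high-priority experience replay for sampling) can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the proportional-priority sampling below follows the standard prioritized experience replay scheme, and the class and method names (`PrioritizedReplayBuffer`, `resize`) are hypothetical.

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay buffer (illustrative sketch).

    Transitions with larger TD error receive larger priorities and are
    sampled more often. `resize` lets training code shrink or grow the
    pool over time, loosely mirroring a learning-curve-driven experience
    pool (hypothetical interface, not the paper's implementation).
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha        # how strongly TD error shapes sampling
        self.data = []            # stored transitions (s, a, r, s')
        self.priorities = []      # one priority per stored transition

    def add(self, transition, td_error=1.0):
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.data) >= self.capacity:   # evict oldest when full
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def resize(self, new_capacity):
        """Adjust pool size, e.g. as the learning curve flattens."""
        self.capacity = new_capacity
        while len(self.data) > self.capacity:  # drop oldest on shrink
            self.data.pop(0)
            self.priorities.pop(0)

    def sample(self, batch_size):
        # Proportional sampling: P(i) = p_i / sum_j p_j
        return random.choices(self.data, weights=self.priorities, k=batch_size)
```

In a MADDPG-style loop each agent's critic update would draw its minibatch via `sample` and refresh priorities from the new TD errors; a full implementation would also apply importance-sampling weights to correct the bias that prioritized sampling introduces.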
ArticleNumber 107621
Author Li, Xinjin
Liu, Hong
Li, Junqing
Li, Yan
Author_xml – sequence: 1
  givenname: Xinjin
  surname: Li
  fullname: Li, Xinjin
  organization: School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
– sequence: 2
  givenname: Hong
  surname: Liu
  fullname: Liu, Hong
  email: lhsdcn@126.com
  organization: School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
– sequence: 3
  givenname: Junqing
  surname: Li
  fullname: Li, Junqing
  organization: School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
– sequence: 4
  givenname: Yan
  surname: Li
  fullname: Li, Yan
  organization: School of Information Science and Engineering, Shandong Normal University, Jinan 250014, China
ContentType Journal Article
Copyright 2021 Elsevier Ltd
DOI 10.1016/j.cie.2021.107621
Discipline Applied Sciences
Engineering
EISSN 1879-0550
ISICitedReferencesCount 26
ISSN 0360-8352
IsPeerReviewed true
IsScholarly true
Keywords Crowd simulation for evacuation
Multi-agent reinforcement learning
Deep reinforcement learning
Path planning
Language English
PublicationCentury 2000
PublicationDate November 2021
PublicationDateYYYYMMDD 2021-11-01
PublicationTitle Computers & industrial engineering
PublicationYear 2021
Publisher Elsevier Ltd
  year: 2020
  ident: 10.1016/j.cie.2021.107621_b0195
  article-title: Learning crowd behavior from real data: A residual network method for crowd simulation
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2020.04.141
– volume: 30
  start-page: 6379
  year: 2017
  ident: 10.1016/j.cie.2021.107621_b0120
  article-title: Multi-agent actor-critic for mixed cooperative-competitive environments
  publication-title: Advances in Neural Information Processing Systems.
– volume: 411
  start-page: 206
  year: 2020
  ident: 10.1016/j.cie.2021.107621_b0205
  article-title: A TD3-based multi-agent deep reinforcement learning method in mixed cooperation-competition environment
  publication-title: Neurocomputing
  doi: 10.1016/j.neucom.2020.05.097
– ident: 10.1016/j.cie.2021.107621_b0015
  doi: 10.1109/TSMCC.2007.913919
– year: 2016
  ident: 10.1016/j.cie.2021.107621_b0160
  article-title: Deep reinforcement learning with double Q-Learning
– volume: 68
  start-page: 360
  year: 2018
  ident: 10.1016/j.cie.2021.107621_b0110
  article-title: A path planning approach for crowd evacuation in buildings based on improved artificial bee colony algorithm
  publication-title: Applied Soft Computing Journal
  doi: 10.1016/j.asoc.2018.04.015
– ident: 10.1016/j.cie.2021.107621_b0155
  doi: 10.1016/j.knosys.2020.106451
– ident: 10.1016/j.cie.2021.107621_b0025
  doi: 10.1016/j.neucom.2016.08.108
– ident: 10.1016/j.cie.2021.107621_b0075
  doi: 10.1016/j.swevo.2019.100600
StartPage 107621
SubjectTerms Crowd simulation for evacuation
Deep reinforcement learning
Multi-agent reinforcement learning
Path planning
Title Deep deterministic policy gradient algorithm for crowd-evacuation path planning
URI https://dx.doi.org/10.1016/j.cie.2021.107621
Volume 161