Improved deep reinforcement learning for car-following decision-making

Detailed bibliography
Published in: Physica A, Volume 624, p. 128912
Main authors: Yang, Xiaoxue; Zou, Yajie; Zhang, Hao; Qu, Xiaobo; Chen, Lei
Format: Journal Article
Language: English
Publication details: Elsevier B.V., 15 August 2023
ISSN:0378-4371, 1873-2119
Abstract Improving the accuracy of car-following (CF) models has attracted much attention in recent years. Although a few studies have used deep reinforcement learning (DRL) to describe CF behavior, designing a proper reward function remains a difficult problem. This study improves the deep deterministic policy gradient (DDPG) car-following model with stacked denoising autoencoders (SDAE) and proposes a data-driven reward representation function that quantifies the implicit interaction between the ego vehicle and the preceding vehicle during car following. The experimental results show that the DDPG-SDAE model imitates driving behavior well in three respects: (1) the reward representation method is effective, yielding low trajectory deviation; (2) the model generalizes across two different trajectory datasets (HighD and SPMD); (3) the model adapts to three traffic scenarios clustered by a k-medoids method based on dynamic time warping distance. Compared with recurrent neural networks (RNN) and the intelligent driver model (IDM), the DDPG-SDAE model performs better in terms of deviation in speed and relative distance. This study demonstrates the advantage of a novel reward extraction method that fuses SDAE into the DDPG algorithm and offers inspiration for developing driving decision-making models.
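The abstract names a dynamic time warping (DTW) distance as the basis for the k-medoids scenario clustering but does not say how that distance is computed. The following is only a minimal sketch of the textbook DTW recurrence, assuming univariate sequences (for example, speed profiles) and an absolute-difference local cost; the function name `dtw_distance` and these choices are illustrative assumptions, not the authors' implementation.

```python
from math import inf


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumption (not from the record): univariate inputs and an
    absolute-difference local cost, with no warping-window constraint.
    """
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

A pairwise matrix of such distances could then be passed to any k-medoids routine that accepts precomputed dissimilarities; which routine the study used is not stated in this record.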
ArticleNumber 128912
Author Zou, Yajie
Qu, Xiaobo
Zhang, Hao
Yang, Xiaoxue
Chen, Lei
Author_xml – sequence: 1
  givenname: Xiaoxue
  orcidid: 0000-0002-8504-3207
  surname: Yang
  fullname: Yang, Xiaoxue
  organization: Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China
– sequence: 2
  givenname: Yajie
  surname: Zou
  fullname: Zou, Yajie
  email: yajiezou@hotmail.com
  organization: Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China
– sequence: 3
  givenname: Hao
  surname: Zhang
  fullname: Zhang, Hao
  organization: Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China
– sequence: 4
  givenname: Xiaobo
  surname: Qu
  fullname: Qu, Xiaobo
  organization: School of Vehicle and Mobility, Tsinghua University, China
– sequence: 5
  givenname: Lei
  orcidid: 0000-0001-9808-1483
  surname: Chen
  fullname: Chen, Lei
  organization: RISE Research Institutes of Sweden, Lindholmspiren 3 A, 417 56, Göteborg, Sweden
BackLink https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-65535$$DView record from Swedish Publication Index
CitedBy_id crossref_primary_10_1155_2024_2442427
crossref_primary_10_3390_s25030644
crossref_primary_10_1007_s42154_024_00332_w
crossref_primary_10_1016_j_jatrs_2024_100001
crossref_primary_10_1016_j_commtr_2025_100164
crossref_primary_10_1016_j_trc_2024_104920
crossref_primary_10_1155_2023_8815106
crossref_primary_10_3390_su151813325
crossref_primary_10_1155_atr_5579549
crossref_primary_10_1007_s11071_024_10660_5
crossref_primary_10_1016_j_physa_2025_130904
crossref_primary_10_3390_su16146182
crossref_primary_10_12677_csa_2025_159224
Cites_doi 10.1016/S1369-8478(00)00005-X
10.3141/2188-10
10.1016/j.trb.2016.04.012
10.1016/j.trc.2017.08.027
10.1145/1390156.1390294
10.1016/j.physa.2019.122967
10.1016/j.physa.2022.127303
10.1109/TITS.2019.2942014
10.1016/j.trc.2019.08.011
10.1016/j.trc.2021.103165
10.1109/TSMCA.2012.2192262
10.1038/nature14236
10.1016/j.physa.2022.128196
10.1016/j.aap.2022.106780
10.1016/j.trc.2021.102980
10.1016/j.physa.2018.09.136
10.1016/j.trc.2014.09.008
10.1016/j.apenergy.2019.114030
10.1016/0893-6080(89)90020-8
10.1126/science.1127647
10.1162/NECO_a_00393
10.1016/j.trc.2018.10.024
10.1145/2339530.2339576
10.1145/3065386
10.1038/nature16961
10.1016/j.physa.2023.128747
10.1109/TITS.2017.2701846
10.1109/TITS.2018.2870525
10.1016/j.trb.2018.12.016
10.1137/1114019
10.1080/01441640600823940
ContentType Journal Article
Copyright 2023 Elsevier B.V.
Copyright_xml – notice: 2023 Elsevier B.V.
DBID AAYXX
CITATION
ADTPV
AOWAS
DOI 10.1016/j.physa.2023.128912
DatabaseName CrossRef
SwePub
SwePub Articles
DatabaseTitle CrossRef
DatabaseTitleList

DeliveryMethod fulltext_linktorsrc
Discipline Physics
ExternalDocumentID oai_DiVA_org_ri_65535
10_1016_j_physa_2023_128912
S0378437123004673
ID FETCH-LOGICAL-c339t-1889ca3f88f6b55b1eb96dc7f59aba1af9292f4d597d73dd50ed5a2e36c24af93
ISICitedReferencesCount 12
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001159318300001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0378-4371
1873-2119
IngestDate Tue Nov 04 16:09:40 EST 2025
Sat Nov 29 07:14:51 EST 2025
Tue Nov 18 22:07:57 EST 2025
Tue Dec 03 03:45:09 EST 2024
IsPeerReviewed true
IsScholarly true
Keywords Driving behavior imitation
Deep reinforcement learning
Stacked denoising autoencoders
Car-following model
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c339t-1889ca3f88f6b55b1eb96dc7f59aba1af9292f4d597d73dd50ed5a2e36c24af93
ORCID 0000-0001-9808-1483
0000-0002-8504-3207
ParticipantIDs swepub_primary_oai_DiVA_org_ri_65535
crossref_primary_10_1016_j_physa_2023_128912
crossref_citationtrail_10_1016_j_physa_2023_128912
elsevier_sciencedirect_doi_10_1016_j_physa_2023_128912
PublicationCentury 2000
PublicationDate 2023-08-15
PublicationDateYYYYMMDD 2023-08-15
PublicationDate_xml – month: 08
  year: 2023
  text: 2023-08-15
  day: 15
PublicationDecade 2020
PublicationTitle Physica A
PublicationYear 2023
Publisher Elsevier B.V
Publisher_xml – name: Elsevier B.V
References Sutton, McAllester, Singh, Mansour (b27) 2000
Mnih, Kavukcuoglu, Silver, Rusu, Veness, Bellemare, Graves, Riedmiller, Fidjeland, Ostrovski (b12) 2015; 518
Saifuzzaman, Zheng (b5) 2014; 48
Brackstone, McDonald (b4) 1999; 2
Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa (b30) 2015
Ossen (b48) 2008
Khodayari, Ghaffari, Kazemi, Braunstingl (b8) 2012; 42
Zhou, Yu, Qu (b17) 2020; 21
Guo, Angah, Liu, Ban (b22) 2021; 124
Kingma, Ba (b43) 2017
Yang, Zou, Chen (b3) 2022; 175
Punzo, Zheng, Montanino (b50) 2021; 128
P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol, Extracting and composing robust features with denoising autoencoders, in: Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 1096–1103.
Ijspeert, Nakanishi, Hoffmann, Pastor, Schaal (b51) 2013; 25
Liao, Yu, Chen, Zhou, Li (b25) 2022
Lillicrap, Hunt, Pritzel, Heess, Erez, Tassa, Silver, Wierstra (b24) 2015
T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, E. Keogh, Searching and mining trillions of time series subsequences under dynamic time warping, in: Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., 2012, pp. 262–270.
Qu, Yu, Zhou, Lin, Wang (b18) 2020; 257
Punzo, Montanino (b46) 2016; 91
Wang, Jiang, Li, Lin, Wang (b9) 2019; 514
Zhou, Qu, Li (b10) 2017; 84
Jiang, Xie, Chen, Li, Evans (b20) 2020
Silver, Huang, Maddison, Guez, Sifre, van den Driessche, Schrittwieser, Antonoglou, Panneershelvam, Lanctot, Dieleman, Grewe, Nham, Kalchbrenner, Sutskever, Lillicrap, Leach, Kavukcuoglu, Graepel, Hassabis (b15) 2016; 529
Mnih, Badia, Mirza, Graves, Lillicrap, Harley, Silver, Kavukcuoglu (b13) 2016
He, Lou, Yang, Lv (b21) 2022
Wu, Zou, Wu, Zhang (b47) 2023; 620
Wang, Xi, Zhao (b40) 2018; 20
Bezzina, Sayer (b35) 2014; 812
Krizhevsky, Sutskever, Hinton (b42) 2017; 60
Hinton, Salakhutdinov (b31) 2006; 313
Wang, Liu, Ci, Wu (b1) 2022; 607
Zhao, Huang, Peng, Lam, LeBlanc (b36) 2018; 19
Hausknecht, Stone (b14) 2016
Berndt, Clifford (b39) 1994
Aggarwal, Reddy (b37) 2013
V.R. Konda, J.N. Tsitsiklis, Actor-Critic Algorithms, (n.d.) 7.
Kreidieh, Wu, Bayen (b19) 2018
Sharma, Zheng, Bhaskar (b45) 2019; 120
Vincent, Larochelle, Lajoie, Bengio, Manzagol, Bottou (b33) 2010; 11
Lipton, Berkowitz, Elkan (b44) 2015
Silver, Lever, Heess, Degris, Wierstra, Riedmiller (b29) 2014
Krajewski, Bock, Kloeker, Eckstein (b34) 2018
Shi, Wu, Shi, Zhou, Ran (b7) 2022; 599
Ye, Zhang, Sun (b23) 2019; 107
Zhu, Wang, Wang (b26) 2018; 97
Mnih, Kavukcuoglu, Silver, Graves, Antonoglou, Wierstra, Riedmiller (b11) 2013
Wang, Wang, Chen, Jing (b49) 2010; 2188
Peng, Liu, Dennis (b2) 2020; 538
Toledo (b6) 2007; 27
Epanechnikov (b41) 1969; 14
Hornik, Stinchcombe, White (b16) 1989; 2
References_xml – volume: 60
  start-page: 84
  year: 2017
  end-page: 90
  ident: b42
  article-title: ImageNet classification with deep convolutional neural networks
  publication-title: Commun. ACM
– volume: 175
  year: 2022
  ident: b3
  article-title: Operation analysis of freeway mixed traffic flow based on catch-up coordination platoon
  publication-title: Accid. Anal. Prev.
– volume: 620
  year: 2023
  ident: b47
  article-title: Application of Bayesian model averaging for modeling time headway distribution
  publication-title: Phys. Stat. Mech. Appl.
– volume: 91
  start-page: 21
  year: 2016
  end-page: 33
  ident: b46
  article-title: Speed or spacing? Cumulative variables, and convolution of model errors and time in traffic flow models validation and calibration
  publication-title: Transp. Res. Part B Methodol.
– volume: 48
  start-page: 379
  year: 2014
  end-page: 403
  ident: b5
  article-title: Incorporating human-factors in car-following models: A review of recent developments and research needs
  publication-title: Transp. Res. Part C Emerg. Technol.
– start-page: 1928
  year: 2016
  end-page: 1937
  ident: b13
  article-title: Asynchronous methods for deep reinforcement learning
  publication-title: Int. Conf. Mach. Learn.
– volume: 97
  start-page: 348
  year: 2018
  end-page: 368
  ident: b26
  article-title: Human-like autonomous car-following model with deep reinforcement learning
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 14
  start-page: 153
  year: 1969
  end-page: 158
  ident: b41
  article-title: Non-parametric estimation of a multivariate probability density
  publication-title: Theory Probab. Appl.
– volume: 538
  year: 2020
  ident: b2
  article-title: An improved car-following model with consideration of multiple preceding and following vehicles in a driver’s view
  publication-title: Phys. Stat. Mech. Appl.
– volume: 812
  start-page: 18
  year: 2014
  ident: b35
  article-title: Safety pilot model deployment: Test conductor team report
  publication-title: Rep. No DOT HS.
– volume: 529
  start-page: 484
  year: 2016
  end-page: 489
  ident: b15
  article-title: Mastering the game of go with deep neural networks and tree search
  publication-title: Nature
– volume: 25
  start-page: 328
  year: 2013
  end-page: 373
  ident: b51
  article-title: Dynamical movement primitives: learning attractor models for motor behaviors
  publication-title: Neural Comput.
– year: 2013
  ident: b37
  article-title: DATA CLUSTERING Algorithms and Applications
– start-page: 1057
  year: 2000
  end-page: 1063
  ident: b27
  article-title: Policy gradient methods for reinforcement learning with function approximation
  publication-title: Adv. Neural Inf. Process. Syst.
– year: 2015
  ident: b24
  article-title: Continuous control with deep reinforcement learning
– volume: 599
  year: 2022
  ident: b7
  article-title: An integrated car-following and lane changing vehicle trajectory prediction algorithm based on a deep neural network
  publication-title: Phys. Stat. Mech. Appl.
– volume: 11
  year: 2010
  ident: b33
  article-title: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
  publication-title: J. Mach. Learn. Res.
– year: 2022
  ident: b21
  article-title: Robust decision making for autonomous vehicles at highway on-ramps: A constrained adversarial reinforcement learning approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 27
  start-page: 65
  year: 2007
  end-page: 84
  ident: b6
  article-title: Driving behaviour: Models and challenges
  publication-title: Transp. Rev.
– volume: 42
  start-page: 1440
  year: 2012
  end-page: 1449
  ident: b8
  article-title: A modified car-following model based on a neural network model of the human driver effects
  publication-title: IEEE Trans. Syst. Man Cybern.-Part Syst. Hum.
– volume: 2
  start-page: 181
  year: 1999
  end-page: 196
  ident: b4
  article-title: Car-following: a historical review
  publication-title: Transp. Res. Part F Traffic Psychol. Behav.
– start-page: 359
  year: 1994
  end-page: 370
  ident: b39
  article-title: Using dynamic time warping to find patterns in time series
  publication-title: Proc. 3rd Int. Conf. Knowl. Discov. Data Min
– volume: 2
  start-page: 359
  year: 1989
  end-page: 366
  ident: b16
  article-title: Multilayer feedforward networks are universal approximators
  publication-title: Neural Netw.
– start-page: 387
  year: 2014
  end-page: 395
  ident: b29
  article-title: Deterministic policy gradient algorithms
  publication-title: Int. Conf. Mach. Learn.
– volume: 514
  start-page: 786
  year: 2019
  end-page: 795
  ident: b9
  article-title: Long memory is important: A test study on deep-learning based car-following model
  publication-title: Phys. Stat. Mech. Appl.
– reference: P. Vincent, H. Larochelle, Y. Bengio, P.-A. Manzagol, Extracting and composing robust features with denoising autoencoders, in: Proc. 25th Int. Conf. Mach. Learn., 2008, pp. 1096–1103.
– volume: 128
  year: 2021
  ident: b50
  article-title: About calibration of car-following dynamics of automated and human-driven vehicles: Methodology, guidelines and codes
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 124
  year: 2021
  ident: b22
  article-title: Hybrid deep reinforcement learning based eco-driving for low-level connected and automated vehicles along signalized corridors
  publication-title: Transp. Res. Part C Emerg. Technol.
– year: 2015
  ident: b30
  article-title: Continuous control with deep reinforcement learning
– year: 2008
  ident: b48
  article-title: Longitudinal driving behavior: theory and empirics
– volume: 2188
  start-page: 85
  year: 2010
  end-page: 95
  ident: b49
  article-title: Using trajectory data to analyze intradriver heterogeneity in car-following
  publication-title: Transp. Res. Rec.
– volume: 313
  start-page: 504
  year: 2006
  end-page: 507
  ident: b31
  article-title: Reducing the dimensionality of data with neural networks
  publication-title: Science
– volume: 120
  start-page: 49
  year: 2019
  end-page: 75
  ident: b45
  article-title: Is more always better? The impact of vehicular trajectory completeness on car-following model calibration and validation
  publication-title: Transp. Res. Part B Methodol.
– reference: V.R. Konda, J.N. Tsitsiklis, Actor-Critic Algorithms, (n.d.) 7.
– year: 2017
  ident: b43
  article-title: Adam: A method for stochastic optimization
– volume: 21
  start-page: 433
  year: 2020
  end-page: 443
  ident: b17
  article-title: Development of an efficient driving strategy for connected and automated vehicles at signalized intersections: A reinforcement learning approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 607
  year: 2022
  ident: b1
  article-title: Effect of front two adjacent vehicles’ velocity information on car-following model construction and stability analysis
  publication-title: Phys. Stat. Mech. Appl.
– volume: 20
  start-page: 2986
  year: 2018
  end-page: 2998
  ident: b40
  article-title: Driving style analysis using primitive driving patterns with Bayesian nonparametric approaches
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 518
  start-page: 529
  year: 2015
  end-page: 533
  ident: b12
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
– volume: 19
  start-page: 733
  year: 2018
  end-page: 744
  ident: b36
  article-title: Accelerated evaluation of automated vehicles in car-following maneuvers
  publication-title: IEEE Trans. Intell. Transp. Syst.
– volume: 107
  start-page: 155
  year: 2019
  end-page: 170
  ident: b23
  article-title: Automated vehicle’s behavior decision making using deep reinforcement learning and high-fidelity simulation environment
  publication-title: Transp. Res. Part C Emerg. Technol.
– volume: 84
  start-page: 245
  year: 2017
  end-page: 264
  ident: b10
  article-title: A recurrent neural network based microscopic car following model to predict traffic oscillation
  publication-title: Transp. Res. Part C Emerg. Technol.
– year: 2013
  ident: b11
  article-title: Playing atari with deep reinforcement learning
– year: 2015
  ident: b44
  article-title: A critical review of recurrent neural networks for sequence learning
– year: 2016
  ident: b14
  article-title: Deep reinforcement learning in parameterized action space
– reference: T. Rakthanmanon, B. Campana, A. Mueen, G. Batista, B. Westover, Q. Zhu, J. Zakaria, E. Keogh, Searching and mining trillions of time series subsequences under dynamic time warping, in: Proc. 18th ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., 2012, pp. 262–270.
– start-page: 1475
  year: 2018
  end-page: 1480
  ident: b19
  article-title: Dissipating stop-and-go waves in closed and open networks via deep reinforcement learning
  publication-title: 2018 21st Int. Conf. Intell. Transp. Syst. ITSC
– start-page: 2118
  year: 2018
  end-page: 2125
  ident: b34
  article-title: The highd dataset: A drone dataset of naturalistic vehicle trajectories on german highways for validation of highly automated driving systems
  publication-title: 2018 21st Int. Conf. Intell. Transp. Syst. ITSC
– volume: 257
  year: 2020
  ident: b18
  article-title: Jointly dampening traffic oscillations and improving energy consumption with electric, connected and automated vehicles: A reinforcement learning based approach
  publication-title: Appl. Energy.
– year: 2020
  ident: b20
  article-title: Dampen the stop-and-go traffic with connected and automated vehicles–a deep reinforcement learning approach
– start-page: 1
  year: 2022
  end-page: 29
  ident: b25
  article-title: Modelling personalised car-following behaviour: a memory-based deep reinforcement learning approach
  publication-title: Transp. Transp. Sci.
– year: 2013
  ident: 10.1016/j.physa.2023.128912_b11
– volume: 2
  start-page: 181
  year: 1999
  ident: 10.1016/j.physa.2023.128912_b4
  article-title: Car-following: a historical review
  publication-title: Transp. Res. Part F Traffic Psychol. Behav.
  doi: 10.1016/S1369-8478(00)00005-X
– start-page: 387
  year: 2014
  ident: 10.1016/j.physa.2023.128912_b29
  article-title: Deterministic policy gradient algorithms
– start-page: 1928
  year: 2016
  ident: 10.1016/j.physa.2023.128912_b13
  article-title: Asynchronous methods for deep reinforcement learning
– start-page: 1
  year: 2022
  ident: 10.1016/j.physa.2023.128912_b25
  article-title: Modelling personalised car-following behaviour: a memory-based deep reinforcement learning approach
  publication-title: Transp. Transp. Sci.
– volume: 2188
  start-page: 85
  year: 2010
  ident: 10.1016/j.physa.2023.128912_b49
  article-title: Using trajectory data to analyze intradriver heterogeneity in car-following
  publication-title: Transp. Res. Rec.
  doi: 10.3141/2188-10
– volume: 11
  year: 2010
  ident: 10.1016/j.physa.2023.128912_b33
  article-title: Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
  publication-title: J. Mach. Learn. Res.
– year: 2020
  ident: 10.1016/j.physa.2023.128912_b20
– volume: 812
  start-page: 18
  year: 2014
  ident: 10.1016/j.physa.2023.128912_b35
  article-title: Safety pilot model deployment: Test conductor team report
  publication-title: Rep. No DOT HS.
– volume: 91
  start-page: 21
  year: 2016
  ident: 10.1016/j.physa.2023.128912_b46
  article-title: Speed or spacing? Cumulative variables, and convolution of model errors and time in traffic flow models validation and calibration
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2016.04.012
– volume: 84
  start-page: 245
  year: 2017
  ident: 10.1016/j.physa.2023.128912_b10
  article-title: A recurrent neural network based microscopic car following model to predict traffic oscillation
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2017.08.027
– start-page: 1057
  year: 2000
  ident: 10.1016/j.physa.2023.128912_b27
  article-title: Policy gradient methods for reinforcement learning with function approximation
– ident: 10.1016/j.physa.2023.128912_b32
  doi: 10.1145/1390156.1390294
– volume: 538
  year: 2020
  ident: 10.1016/j.physa.2023.128912_b2
  article-title: An improved car-following model with consideration of multiple preceding and following vehicles in a driver’s view
  publication-title: Phys. Stat. Mech. Appl.
  doi: 10.1016/j.physa.2019.122967
– volume: 599
  year: 2022
  ident: 10.1016/j.physa.2023.128912_b7
  article-title: An integrated car-following and lane changing vehicle trajectory prediction algorithm based on a deep neural network
  publication-title: Phys. Stat. Mech. Appl.
  doi: 10.1016/j.physa.2022.127303
– volume: 21
  start-page: 433
  year: 2020
  ident: 10.1016/j.physa.2023.128912_b17
  article-title: Development of an efficient driving strategy for connected and automated vehicles at signalized intersections: A reinforcement learning approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2019.2942014
– start-page: 2118
  year: 2018
  ident: 10.1016/j.physa.2023.128912_b34
  article-title: The highd dataset: A drone dataset of naturalistic vehicle trajectories on german highways for validation of highly automated driving systems
– volume: 107
  start-page: 155
  year: 2019
  ident: 10.1016/j.physa.2023.128912_b23
  article-title: Automated vehicle’s behavior decision making using deep reinforcement learning and high-fidelity simulation environment
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2019.08.011
– volume: 128
  year: 2021
  ident: 10.1016/j.physa.2023.128912_b50
  article-title: About calibration of car-following dynamics of automated and human-driven vehicles: Methodology, guidelines and codes
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2021.103165
– volume: 42
  start-page: 1440
  year: 2012
  ident: 10.1016/j.physa.2023.128912_b8
  article-title: A modified car-following model based on a neural network model of the human driver effects
  publication-title: IEEE Trans. Syst. Man Cybern.-Part Syst. Hum.
  doi: 10.1109/TSMCA.2012.2192262
– volume: 518
  start-page: 529
  year: 2015
  ident: 10.1016/j.physa.2023.128912_b12
  article-title: Human-level control through deep reinforcement learning
  publication-title: Nature
  doi: 10.1038/nature14236
– volume: 607
  year: 2022
  ident: 10.1016/j.physa.2023.128912_b1
  article-title: Effect of front two adjacent vehicles’ velocity information on car-following model construction and stability analysis
  publication-title: Phys. Stat. Mech. Appl.
  doi: 10.1016/j.physa.2022.128196
– start-page: 359
  year: 1994
  ident: 10.1016/j.physa.2023.128912_b39
  article-title: Using dynamic time warping to find patterns in time series
– volume: 175
  year: 2022
  ident: 10.1016/j.physa.2023.128912_b3
  article-title: Operation analysis of freeway mixed traffic flow based on catch-up coordination platoon
  publication-title: Accid. Anal. Prev.
  doi: 10.1016/j.aap.2022.106780
– ident: 10.1016/j.physa.2023.128912_b28
– volume: 124
  year: 2021
  ident: 10.1016/j.physa.2023.128912_b22
  article-title: Hybrid deep reinforcement learning based eco-driving for low-level connected and automated vehicles along signalized corridors
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2021.102980
– volume: 514
  start-page: 786
  year: 2019
  ident: 10.1016/j.physa.2023.128912_b9
  article-title: Long memory is important: A test study on deep-learning based car-following model
  publication-title: Phys. Stat. Mech. Appl.
  doi: 10.1016/j.physa.2018.09.136
– year: 2015
  ident: 10.1016/j.physa.2023.128912_b24
– volume: 48
  start-page: 379
  year: 2014
  ident: 10.1016/j.physa.2023.128912_b5
  article-title: Incorporating human-factors in car-following models: A review of recent developments and research needs
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2014.09.008
– volume: 257
  year: 2020
  ident: 10.1016/j.physa.2023.128912_b18
  article-title: Jointly dampening traffic oscillations and improving energy consumption with electric, connected and automated vehicles: A reinforcement learning based approach
  publication-title: Appl. Energy.
  doi: 10.1016/j.apenergy.2019.114030
– volume: 2
  start-page: 359
  year: 1989
  ident: 10.1016/j.physa.2023.128912_b16
  article-title: Multilayer feedforward networks are universal approximators
  publication-title: Neural Netw.
  doi: 10.1016/0893-6080(89)90020-8
– volume: 313
  start-page: 504
  year: 2006
  ident: 10.1016/j.physa.2023.128912_b31
  article-title: Reducing the dimensionality of data with neural networks
  publication-title: Science
  doi: 10.1126/science.1127647
– volume: 25
  start-page: 328
  year: 2013
  ident: 10.1016/j.physa.2023.128912_b51
  article-title: Dynamical movement primitives: learning attractor models for motor behaviors
  publication-title: Neural Comput.
  doi: 10.1162/NECO_a_00393
– year: 2008
  ident: 10.1016/j.physa.2023.128912_b48
– year: 2015
  ident: 10.1016/j.physa.2023.128912_b30
– year: 2013
  ident: 10.1016/j.physa.2023.128912_b37
– year: 2016
  ident: 10.1016/j.physa.2023.128912_b14
– volume: 97
  start-page: 348
  year: 2018
  ident: 10.1016/j.physa.2023.128912_b26
  article-title: Human-like autonomous car-following model with deep reinforcement learning
  publication-title: Transp. Res. Part C Emerg. Technol.
  doi: 10.1016/j.trc.2018.10.024
– ident: 10.1016/j.physa.2023.128912_b38
  doi: 10.1145/2339530.2339576
– volume: 60
  start-page: 84
  year: 2017
  ident: 10.1016/j.physa.2023.128912_b42
  article-title: ImageNet classification with deep convolutional neural networks
  publication-title: Commun. ACM
  doi: 10.1145/3065386
– volume: 529
  start-page: 484
  year: 2016
  ident: 10.1016/j.physa.2023.128912_b15
  article-title: Mastering the game of Go with deep neural networks and tree search
  publication-title: Nature
  doi: 10.1038/nature16961
– volume: 620
  year: 2023
  ident: 10.1016/j.physa.2023.128912_b47
  article-title: Application of Bayesian model averaging for modeling time headway distribution
  publication-title: Physica A
  doi: 10.1016/j.physa.2023.128747
– volume: 19
  start-page: 733
  year: 2018
  ident: 10.1016/j.physa.2023.128912_b36
  article-title: Accelerated evaluation of automated vehicles in car-following maneuvers
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2017.2701846
– volume: 20
  start-page: 2986
  year: 2018
  ident: 10.1016/j.physa.2023.128912_b40
  article-title: Driving style analysis using primitive driving patterns with Bayesian nonparametric approaches
  publication-title: IEEE Trans. Intell. Transp. Syst.
  doi: 10.1109/TITS.2018.2870525
– volume: 120
  start-page: 49
  year: 2019
  ident: 10.1016/j.physa.2023.128912_b45
  article-title: Is more always better? The impact of vehicular trajectory completeness on car-following model calibration and validation
  publication-title: Transp. Res. Part B Methodol.
  doi: 10.1016/j.trb.2018.12.016
– volume: 14
  start-page: 153
  year: 1969
  ident: 10.1016/j.physa.2023.128912_b41
  article-title: Non-parametric estimation of a multivariate probability density
  publication-title: Theory Probab. Appl.
  doi: 10.1137/1114019
– start-page: 1475
  year: 2018
  ident: 10.1016/j.physa.2023.128912_b19
  article-title: Dissipating stop-and-go waves in closed and open networks via deep reinforcement learning
– volume: 27
  start-page: 65
  year: 2007
  ident: 10.1016/j.physa.2023.128912_b6
  article-title: Driving behaviour: Models and challenges
  publication-title: Transp. Rev.
  doi: 10.1080/01441640600823940
– year: 2022
  ident: 10.1016/j.physa.2023.128912_b21
  article-title: Robust decision making for autonomous vehicles at highway on-ramps: A constrained adversarial reinforcement learning approach
  publication-title: IEEE Trans. Intell. Transp. Syst.
– year: 2015
  ident: 10.1016/j.physa.2023.128912_b44
– year: 2017
  ident: 10.1016/j.physa.2023.128912_b43
StartPage 128912
SubjectTerms Auto encoders
Behavioral research
Car-following model
Car-following modeling
De-noising
Decision making
Deep reinforcement learning
Deterministics
Driving behavior imitation
Driving behaviour
Learning systems
Policy gradient
Recurrent neural networks
Reinforcement learning
Reinforcement learnings
Stacked denoising autoencoder
Stacked denoising autoencoders
Title Improved deep reinforcement learning for car-following decision-making
URI https://dx.doi.org/10.1016/j.physa.2023.128912
https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-65535
Volume 624