Improved deep reinforcement learning for car-following decision-making
| Published in: | Physica A, Vol. 624, p. 128912 |
|---|---|
| Main authors: | Yang, Xiaoxue; Zou, Yajie; Zhang, Hao; Qu, Xiaobo; Chen, Lei |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V, 15.08.2023 |
| ISSN: | 0378-4371, 1873-2119 |
| Abstract | Improving the accuracy of car-following (CF) models has attracted much attention in recent years. Although a few studies use deep reinforcement learning (DRL) to describe CF behavior, the proper design of the reward function remains an intractable problem. This study improves the deep deterministic policy gradient (DDPG) car-following model with stacked denoising autoencoders (SDAE) and proposes a data-driven reward representation function that quantifies the implicit interaction between the ego vehicle and the preceding vehicle during car following. The experimental results show that the DDPG-SDAE model is better at imitating driving behavior in three respects: (1) the reward representation method is effective, yielding low trajectory deviation; (2) the model generalizes across two different trajectory datasets (HighD and SPMD); (3) the model adapts to three traffic scenarios clustered by a k-medoids method based on dynamic time warping distance. Compared with recurrent neural networks (RNN) and the intelligent driver model (IDM), the DDPG-SDAE model performs better in terms of speed and relative-distance deviation. The study demonstrates the advantage of a novel reward extraction method that fuses SDAE into the DDPG algorithm, and it offers inspiration for developing driving decision-making models. |
|---|---|
| ArticleNumber | 128912 |
| Author | Zou, Yajie; Qu, Xiaobo; Zhang, Hao; Yang, Xiaoxue; Chen, Lei |
| Author_xml | 1. Yang, Xiaoxue (ORCID: 0000-0002-8504-3207), Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China; 2. Zou, Yajie (email: yajiezou@hotmail.com), Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China; 3. Zhang, Hao, Key Laboratory of Road and Traffic Engineering of Ministry of Education, Tongji University, Shanghai, China; 4. Qu, Xiaobo, School of Vehicle and Mobility, Tsinghua University, China; 5. Chen, Lei (ORCID: 0000-0001-9808-1483), RISE Research Institutes of Sweden, Lindholmspiren 3 A, 417 56, Göteborg, Sweden |
| BackLink | https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-65535 (record in the Swedish Publication Index) |
| ContentType | Journal Article |
| Copyright | 2023 Elsevier B.V. |
| DOI | 10.1016/j.physa.2023.128912 |
| Discipline | Physics |
| ISICitedReferencesCount | 12 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Driving behavior imitation; Deep reinforcement learning; Stacked denoising autoencoders; Car-following model |
| ORCID | 0000-0001-9808-1483 0000-0002-8504-3207 |
| PublicationDate | 2023-08-15 |
| PublicationTitle | Physica A |
| PublicationYear | 2023 |
| Publisher | Elsevier B.V |
| StartPage | 128912 |
| SubjectTerms | Auto encoders; Behavioral research; Car-following model; Car-following modeling; De-noising; Decision making; Deep reinforcement learning; Deterministics; Driving behavior imitation; Driving behaviour; Learning systems; Policy gradient; Recurrent neural networks; Reinforcement learning; Reinforcement learnings; Stacked denoising autoencoder; Stacked denoising autoencoders |
| Title | Improved deep reinforcement learning for car-following decision-making |
| URI | https://dx.doi.org/10.1016/j.physa.2023.128912 https://urn.kb.se/resolve?urn=urn:nbn:se:ri:diva-65535 |
| Volume | 624 |