Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm

Bibliographic Details
Title: Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm
Authors: Jingpeng Gan, Jiancheng Zhang, Yuansheng Liu
Source: Applied Sciences, Vol 14, Iss 7, p 2889 (2024)
Publisher Information: MDPI AG
Publication Year: 2024
Collection: Directory of Open Access Journals: DOAJ Articles
Subject Terms: autonomous vehicle, deep reinforcement learning, optimized PPO algorithm, unsignalized roundabout, gap acceptance theory, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
Description: Unsignalized roundabouts have a significant impact on traffic flow and vehicle safety. To address the challenge of autonomous vehicles passing through roundabouts with low penetration, improve their efficiency, and ensure safety and stability, we propose the proximal policy optimization (PPO) algorithm to enhance decision-making behavior. We develop an optimization-based behavioral choice model for autonomous vehicles that incorporates gap acceptance theory and deep reinforcement learning using the PPO algorithm. Additionally, we employ the CoordConv network to establish an aerial view for spatial perception information gathering. Furthermore, a dynamic multi-objective reward mechanism is introduced to maximize the PPO algorithm’s reward pool function while quantifying environmental factors. Through simulation experiments, we demonstrate that our optimized PPO algorithm significantly improves training efficiency by enhancing the reward value function by 2.85%, 7.17%, and 19.58% in scenarios with 20, 100, and 200 social vehicles, respectively, compared to the PPO+CCMR algorithm. The effectiveness of simulation training also increases by 11.1%, 13.8%, and 7.4%. Moreover, there is a reduction in crossing time by 2.37%, 2.62%, and 13.96%. Our optimized PPO algorithm enhances path selection during autonomous vehicle simulation training as they tend to drive in the inner ring over time; however, the influence of social vehicles on path selection diminishes as their quantity increases. The safety of autonomous vehicles remains largely unaffected by our optimized PPO algorithm.
Document Type: article in journal/newspaper
Language: English
Relation: https://www.mdpi.com/2076-3417/14/7/2889; https://doaj.org/toc/2076-3417; https://doaj.org/article/3147b64605354ccd9cc86e06039bdec7
DOI: 10.3390/app14072889
Availability: https://doi.org/10.3390/app14072889
https://doaj.org/article/3147b64605354ccd9cc86e06039bdec7
Accession Number: edsbas.12A5C5E2
Database: BASE
FullText Text:
  Availability: 0
CustomLinks:
  – Url: https://doi.org/10.3390/app14072889#
    Name: EDS - BASE (s4221598)
    Category: fullText
    Text: View record from BASE
  – Url: https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=EBSCO&SrcAuth=EBSCO&DestApp=WOS&ServiceName=TransferToWoS&DestLinkType=GeneralSearchSummary&Func=Links&author=Gan%20J
    Name: ISI
    Category: fullText
    Text: Find this article in Web of Science
    Icon: https://imagesrvr.epnet.com/ls/20docs.gif
    MouseOverText: Find this article in Web of Science
Header DbId: edsbas
DbLabel: BASE
An: edsbas.12A5C5E2
RelevancyScore: 962
AccessLevel: 3
PubType: Academic Journal
PubTypeId: academicJournal
PreciseRelevancyScore: 962.306396484375
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm
– Name: Author
  Label: Authors
  Group: Au
  Data: Jingpeng Gan, Jiancheng Zhang, Yuansheng Liu
– Name: TitleSource
  Label: Source
  Group: Src
  Data: Applied Sciences, Vol 14, Iss 7, p 2889 (2024)
– Name: Publisher
  Label: Publisher Information
  Group: PubInfo
  Data: MDPI AG
– Name: DatePubCY
  Label: Publication Year
  Group: Date
  Data: 2024
– Name: Subset
  Label: Collection
  Group: HoldingsInfo
  Data: Directory of Open Access Journals: DOAJ Articles
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: autonomous vehicle, deep reinforcement learning, optimized PPO algorithm, unsignalized roundabout, gap acceptance theory, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
– Name: Abstract
  Label: Description
  Group: Ab
  Data: Unsignalized roundabouts have a significant impact on traffic flow and vehicle safety. To address the challenge of autonomous vehicles passing through roundabouts with low penetration, improve their efficiency, and ensure safety and stability, we propose the proximal policy optimization (PPO) algorithm to enhance decision-making behavior. We develop an optimization-based behavioral choice model for autonomous vehicles that incorporates gap acceptance theory and deep reinforcement learning using the PPO algorithm. Additionally, we employ the CoordConv network to establish an aerial view for spatial perception information gathering. Furthermore, a dynamic multi-objective reward mechanism is introduced to maximize the PPO algorithm’s reward pool function while quantifying environmental factors. Through simulation experiments, we demonstrate that our optimized PPO algorithm significantly improves training efficiency by enhancing the reward value function by 2.85%, 7.17%, and 19.58% in scenarios with 20, 100, and 200 social vehicles, respectively, compared to the PPO+CCMR algorithm. The effectiveness of simulation training also increases by 11.1%, 13.8%, and 7.4%. Moreover, there is a reduction in crossing time by 2.37%, 2.62%, and 13.96%. Our optimized PPO algorithm enhances path selection during autonomous vehicle simulation training as they tend to drive in the inner ring over time; however, the influence of social vehicles on path selection diminishes as their quantity increases. The safety of autonomous vehicles remains largely unaffected by our optimized PPO algorithm.
– Name: TypeDocument
  Label: Document Type
  Group: TypDoc
  Data: article in journal/newspaper
– Name: Language
  Label: Language
  Group: Lang
  Data: English
– Name: NoteTitleSource
  Label: Relation
  Group: SrcInfo
  Data: https://www.mdpi.com/2076-3417/14/7/2889; https://doaj.org/toc/2076-3417; https://doaj.org/article/3147b64605354ccd9cc86e06039bdec7
– Name: DOI
  Label: DOI
  Group: ID
  Data: 10.3390/app14072889
– Name: URL
  Label: Availability
  Group: URL
  Data: https://doi.org/10.3390/app14072889; https://doaj.org/article/3147b64605354ccd9cc86e06039bdec7
– Name: AN
  Label: Accession Number
  Group: ID
  Data: edsbas.12A5C5E2
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edsbas&AN=edsbas.12A5C5E2
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.3390/app14072889
    Languages:
      – Text: English
    Subjects:
      – SubjectFull: autonomous vehicle
        Type: general
      – SubjectFull: deep reinforcement learning
        Type: general
      – SubjectFull: optimized PPO algorithm
        Type: general
      – SubjectFull: unsignalized roundabout
        Type: general
      – SubjectFull: gap acceptance theory
        Type: general
      – SubjectFull: Technology
        Type: general
      – SubjectFull: Engineering (General). Civil engineering (General)
        Type: general
      – SubjectFull: TA1-2040
        Type: general
      – SubjectFull: Biology (General)
        Type: general
      – SubjectFull: QH301-705.5
        Type: general
      – SubjectFull: Physics
        Type: general
      – SubjectFull: QC1-999
        Type: general
      – SubjectFull: Chemistry
        Type: general
      – SubjectFull: QD1-999
        Type: general
    Titles:
      – TitleFull: Research on Behavioral Decision at an Unsignalized Roundabout for Automatic Driving Based on Proximal Policy Optimization Algorithm
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: Jingpeng Gan
      – PersonEntity:
          Name:
            NameFull: Jiancheng Zhang
      – PersonEntity:
          Name:
            NameFull: Yuansheng Liu
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 01
              M: 01
              Type: published
              Y: 2024
          Identifiers:
            – Type: issn-locals
              Value: edsbas
            – Type: issn-locals
              Value: edsbas.oa
          Titles:
            – TitleFull: Applied Sciences, Vol 14, Iss 7, p 2889 (2024)
              Type: main