Toward Self-Driving Bicycles Using State-of-the-Art Deep Reinforcement Learning Algorithms

Bibliographic Details
Published in: Symmetry (Basel) Vol. 11; no. 2; p. 290
Main Authors: Choi, SeungYoon, Le, Tuyen P., Nguyen, Quang D., Layek, Md Abu, Lee, SeungGwan, Chung, TaeChoong
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 2019-02-01
ISSN: 2073-8994
Abstract: In this paper, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm. We use a reward function and a deep neural network to build the controller. By using the proposed controller, a bicycle can not only be stably balanced but also travel to any specified location. We confirm that the controller with DDPG shows better performance than the other baselines such as Normalized Advantage Function (NAF) and Proximal Policy Optimization (PPO). For the performance evaluation, we implemented the proposed algorithm in various settings such as fixed and random speed, start location, and destination location.
Authors (in listed order, with ORCID iDs where given):
1. Choi, SeungYoon
2. Le, Tuyen P. (ORCID: 0000-0002-1345-2650)
3. Nguyen, Quang D.
4. Layek, Md Abu (ORCID: 0000-0002-0253-4597)
5. Lee, SeungGwan (ORCID: 0000-0002-5510-390X)
6. Chung, TaeChoong
Copyright 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.3390/sym11020290
Discipline: Sciences (General)
EISSN: 2073-8994
Open Access: yes
Peer Reviewed: yes
Scholarly: yes
Subjects: Algorithms; Artificial neural networks; Bicycles; Control theory; Controllers; Deep learning; Learning; Machine learning; Neural networks; Optimization; Performance evaluation; Tires; Velocity