Toward Self-Driving Bicycles Using State-of-the-Art Deep Reinforcement Learning Algorithms
In this paper, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm. We use a reward function and a deep neural network to build the controller. By using the proposed controller, a bicyc...
| Published in: | Symmetry (Basel) Vol. 11; no. 2; p. 290 |
|---|---|
| Main authors: | Choi, SeungYoon; Le, Tuyen P.; Nguyen, Quang D.; Layek, Md Abu; Lee, SeungGwan; Chung, TaeChoong |
| Format: | Journal Article |
| Language: | English |
| Published: | Basel: MDPI AG, 2019-02-01 |
| Subjects: | |
| ISSN: | 2073-8994 |
| Online access: | Get full text |
| Abstract | In this paper, we propose a controller for a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, which is a state-of-the-art deep reinforcement learning algorithm. We use a reward function and a deep neural network to build the controller. By using the proposed controller, a bicycle can not only be stably balanced but also travel to any specified location. We confirm that the controller with DDPG performs better than baselines such as Normalized Advantage Function (NAF) and Proximal Policy Optimization (PPO). For the performance evaluation, we implemented the proposed algorithm in various settings, such as fixed and random speed, start location, and destination location. |
|---|---|
| Author | Layek, Md Abu; Choi, SeungYoon; Le, Tuyen P.; Nguyen, Quang D.; Lee, SeungGwan; Chung, TaeChoong |
| Author_xml | – sequence: 1 givenname: SeungYoon surname: Choi fullname: Choi, SeungYoon – sequence: 2 givenname: Tuyen P. orcidid: 0000-0002-1345-2650 surname: Le fullname: Le, Tuyen P. – sequence: 3 givenname: Quang D. surname: Nguyen fullname: Nguyen, Quang D. – sequence: 4 givenname: Md Abu orcidid: 0000-0002-0253-4597 surname: Layek fullname: Layek, Md Abu – sequence: 5 givenname: SeungGwan orcidid: 0000-0002-5510-390X surname: Lee fullname: Lee, SeungGwan – sequence: 6 givenname: TaeChoong surname: Chung fullname: Chung, TaeChoong |
| CitedBy_id | crossref_primary_10_1109_TCSII_2019_2947682 crossref_primary_10_1177_01423312211037847 crossref_primary_10_3390_wevj15060246 crossref_primary_10_3390_app10217726 crossref_primary_10_1038_s41598_022_23668_x crossref_primary_10_3390_electronics11213495 crossref_primary_10_1109_ACCESS_2024_3447054 crossref_primary_10_3390_act12030109 crossref_primary_10_1109_ACCESS_2023_3311850 crossref_primary_10_3390_info10110341 crossref_primary_10_1109_ACCESS_2023_3268524 crossref_primary_10_3390_sym16091227 crossref_primary_10_1109_ACCESS_2021_3094623 |
| Cites_doi | 10.1109/TCST.2008.2004349 10.1103/PhysRev.36.823 10.1038/nature14236 10.1109/CCDC.2018.8408296 10.1109/TNN.1998.712192 10.1016/j.neunet.2008.02.003 10.1109/IROS.2009.5353966 10.1109/ICCAR.2018.8384682 |
| ContentType | Journal Article |
| Copyright | 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.3390/sym11020290 |
| Discipline | Sciences (General) |
| EISSN | 2073-8994 |
| ISICitedReferencesCount | 17 |
| ISSN | 2073-8994 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 2 |
| Language | English |
| ORCID | 0000-0002-5510-390X 0000-0002-1345-2650 0000-0002-0253-4597 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-02-01 |
| PublicationDateYYYYMMDD | 2019-02-01 |
| PublicationDate_xml | – month: 02 year: 2019 text: 2019-02-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | Basel |
| PublicationPlace_xml | – name: Basel |
| PublicationTitle | Symmetry (Basel) |
| PublicationYear | 2019 |
| Publisher | MDPI AG |
| Publisher_xml | – name: MDPI AG |
| StartPage | 290 |
| SubjectTerms | Algorithms Artificial neural networks Bicycles Control theory Controllers Deep learning Learning Machine learning Neural networks Optimization Performance evaluation Tires Velocity |
| Title | Toward Self-Driving Bicycles Using State-of-the-Art Deep Reinforcement Learning Algorithms |
| URI | https://www.proquest.com/docview/2550273188 |
| Volume | 11 |
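
The abstract describes a controller built from a reward function and a deep neural network trained with DDPG. As an illustration only, the sketch below shows what two such ingredients might look like in Python. The reward shape, its weights, and the names `shaped_reward` and `soft_update` are assumptions made for this note and are not taken from the paper; the soft target update follows the general DDPG recipe rather than the authors' specific implementation.

```python
def shaped_reward(roll_angle, dist_prev, dist_now, fallen,
                  w_balance=1.0, w_progress=1.0, fall_penalty=-10.0):
    """Hypothetical reward: penalize leaning, reward progress toward the goal.

    roll_angle: lean of the bicycle from upright, in radians (assumed state variable).
    dist_prev, dist_now: distance to the destination before and after the step.
    fallen: True if the bicycle tipped over on this step.
    """
    if fallen:
        return fall_penalty
    balance_term = -w_balance * roll_angle ** 2
    progress_term = w_progress * (dist_prev - dist_now)
    return balance_term + progress_term


def soft_update(target, online, tau=0.001):
    """DDPG-style soft target update: target <- tau * online + (1 - tau) * target.

    Weights are plain lists of floats here to keep the sketch dependency-free.
    """
    return [tau * w + (1.0 - tau) * t for t, w in zip(target, online)]
```

In a full agent, `shaped_reward` would be evaluated once per simulation step, and `soft_update` would be applied to both the actor and the critic target networks after each gradient step.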