Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem
Saved in:
| Published in: | Engineering Applications of Artificial Intelligence, Vol. 120, p. 105798 |
|---|---|
| Main authors: | Nugroho, Larasmoyo; Andiarti, Rika; Akmeliawati, Rini; Kutay, Ali Türker; Larasati, Diva Kartika; Wijaya, Sastra Kusuma |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.04.2023 |
| Subjects: | Reward shaping function; DDPG; Reusable launch vehicle; GA-search; Fitness |
| ISSN: | 0952-1976, 1873-6769 |
| Online access: | Full text |
| Abstract | One major capability of a Deep Reinforcement Learning (DRL) agent, controlling a vehicle in an environment without any prior knowledge, is decision-making based on a well-designed reward shaping function. The reward shaping function is an important but little-studied factor that can significantly alter the training reward score and performance outcomes. To maximize the control efficacy of a DRL algorithm, an optimized reward shaping function and a solid hyperparameter combination are essential. To achieve optimal control during the powered descent guidance (PDG) landing phase of a reusable launch vehicle, this paper applies the Deep Deterministic Policy Gradient (DDPG) algorithm together with a genetic algorithm (GA) that searches for the best shape of the reward shaping function (RSF). Although DDPG is quite capable of managing complex environments and producing actions for continuous spaces, its state and action performance can still be improved. A reference DDPG agent with the original reward shaping function, and a PID controller, were compared side by side with a GA-DDPG agent using the GA-optimized RSF. With the RSF found by the potential-based GA (PbGA) search, the best GA-DDPG individual maximizes overall rewards and minimizes state errors, maintaining the highest fitness score among all individuals after being cross-validated and retested extensively in Monte-Carlo experiments. |
|---|---|
| ArticleNumber | 105798 |
| Author | Andiarti, Rika; Larasati, Diva Kartika; Wijaya, Sastra Kusuma; Nugroho, Larasmoyo; Kutay, Ali Türker; Akmeliawati, Rini |
| Author_xml | – sequence: 1 givenname: Larasmoyo orcidid: 0000-0003-1139-0289 surname: Nugroho fullname: Nugroho, Larasmoyo email: larasmoyo.nugroho@brin.go.id organization: Physics Dept., Universitas Indonesia, Depok, Indonesia – sequence: 2 givenname: Rika surname: Andiarti fullname: Andiarti, Rika organization: Rocket Technology Center, Indonesian National Air and Space Agency, Bogor, Indonesia – sequence: 3 givenname: Rini surname: Akmeliawati fullname: Akmeliawati, Rini organization: School of Mechanical Eng., University of Adelaide, Adelaide, Australia – sequence: 4 givenname: Ali Türker orcidid: 0000-0002-7243-1390 surname: Kutay fullname: Kutay, Ali Türker organization: Aeronautical Eng. Dept., Middle East Technical University, Ankara, Turkiye – sequence: 5 givenname: Diva Kartika surname: Larasati fullname: Larasati, Diva Kartika organization: Physics Dept., Universitas Indonesia, Depok, Indonesia – sequence: 6 givenname: Sastra Kusuma orcidid: 0000-0003-0780-9585 surname: Wijaya fullname: Wijaya, Sastra Kusuma email: skwijaya@sci.ui.ac.id organization: Physics Dept., Universitas Indonesia, Depok, Indonesia |
| CitedBy_id | crossref_primary_10_1371_journal_pone_0292539 crossref_primary_10_1016_j_engappai_2024_108511 crossref_primary_10_1109_ACCESS_2024_3359417 crossref_primary_10_1016_j_aei_2025_103720 crossref_primary_10_1016_j_engappai_2024_109485 crossref_primary_10_3390_robotics14040049 crossref_primary_10_1007_s40747_024_01385_4 crossref_primary_10_1080_0952813X_2024_2321152 crossref_primary_10_1007_s10499_025_01914_z crossref_primary_10_1016_j_displa_2024_102796 crossref_primary_10_1016_j_ymssp_2025_112502 crossref_primary_10_1016_j_engappai_2025_110523 crossref_primary_10_1016_j_engappai_2025_111623 crossref_primary_10_1016_j_engappai_2025_110676 |
| ContentType | Journal Article |
| Copyright | 2022 Elsevier Ltd |
| DOI | 10.1016/j.engappai.2022.105798 |
| Discipline | Applied Sciences; Computer Science |
| EISSN | 1873-6769 |
| ISICitedReferencesCount | 15 |
| ISSN | 0952-1976 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Reward shaping function; DDPG; Reusable launch vehicle; GA-search; Fitness |
| Language | English |
| ORCID | 0000-0002-7243-1390 0000-0003-0780-9585 0000-0003-1139-0289 |
| PublicationDate | April 2023 |
| PublicationTitle | Engineering applications of artificial intelligence |
| PublicationYear | 2023 |
| Publisher | Elsevier Ltd |
Robust Nonlinear Control doi: 10.1002/rnc.727 – year: 2020 ident: 10.1016/j.engappai.2022.105798_b10 – start-page: 155 year: 2003 ident: 10.1016/j.engappai.2022.105798_b50 – start-page: 1 year: 2018 ident: 10.1016/j.engappai.2022.105798_b40 – volume: 16 start-page: 1 issue: 6 June year: 2021 ident: 10.1016/j.engappai.2022.105798_b2 article-title: Optimizing hyperparameters of deep reinforcement learning for autonomous driving based on whale optimization algorithm publication-title: PLoS ONE – year: 2021 ident: 10.1016/j.engappai.2022.105798_b38 – volume: 33 issue: 6 year: 1996 ident: 10.1016/j.engappai.2022.105798_b59 article-title: Near-optimal low-thrust orbit transfers generated by a genetic algorithm publication-title: J. Spacecr. Rockets doi: 10.2514/3.26850 – year: 2019 ident: 10.1016/j.engappai.2022.105798_b23 – volume: 8 issue: 3 year: 1992 ident: 10.1016/j.engappai.2022.105798_b45 article-title: Integration of representation into goad-driven behavior-based robots publication-title: IEEE Trans. Robot. Autom. doi: 10.1109/70.143349 – year: 2020 ident: 10.1016/j.engappai.2022.105798_b62 – start-page: 1517 year: 2020 ident: 10.1016/j.engappai.2022.105798_b56 – year: 2019 ident: 10.1016/j.engappai.2022.105798_b31 article-title: Interplanetary transfers via deep representations of the optimal policy and/or of the value function – start-page: 1 year: 2017 ident: 10.1016/j.engappai.2022.105798_b33 |
| Snippet | One major capability of a Deep Reinforcement Learning (DRL) agent to control a specific vehicle in an environment without any prior knowledge is... |
| SourceID | crossref elsevier |
| SourceType | Enrichment Source; Index Database; Publisher |
| StartPage | 105798 |
| SubjectTerms | DDPG; Fitness; GA-search; Reusable launch vehicle; Reward shaping function |
| Title | Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem |
| URI | https://dx.doi.org/10.1016/j.engappai.2022.105798 |
| Volume | 120 |
| WOSCitedRecordID | wos000924435400001 |
| linkProvider | Elsevier |
| openUrl | (Z39.88-2004 context object, decoded) Nugroho, Larasmoyo; Andiarti, Rika; Akmeliawati, Rini; Kutay, Ali Türker (2023-04-01). "Optimization of reward shaping function based on genetic algorithm applied to a cross validated deep deterministic policy gradient in a powered landing guidance problem". Engineering applications of artificial intelligence 120: 105798. ISSN 0952-1976. doi: 10.1016/j.engappai.2022.105798 |
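The openUrl field carries an OpenURL 1.0 (Z39.88-2004) context object: a plain key/encoded-value query string in which repeated `rft.au` keys list the authors and `rft_id` carries the DOI. As a minimal sketch, such a string can be decoded with Python's standard `urllib.parse`; the query below is a shortened illustrative fragment in the same format, not the full URL from this record.

```python
from urllib.parse import parse_qs

# Shortened OpenURL 1.0 (Z39.88-2004) context object in key/encoded-value
# form; values use '+' for spaces and %XX escapes, and rft.au repeats
# once per author.
openurl_query = (
    "ctx_ver=Z39.88-2004"
    "&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal"
    "&rft.genre=article"
    "&rft.jtitle=Engineering+applications+of+artificial+intelligence"
    "&rft.au=Nugroho%2C+Larasmoyo"
    "&rft.au=Andiarti%2C+Rika"
    "&rft.date=2023-04-01"
    "&rft.issn=0952-1976"
    "&rft.volume=120"
    "&rft.spage=105798"
    "&rft_id=info:doi/10.1016%2Fj.engappai.2022.105798"
)

# parse_qs decodes '+' and %-escapes, and gathers repeated keys
# (here the rft.au authors) into a single list per key.
fields = parse_qs(openurl_query)

print(fields["rft.jtitle"][0])  # Engineering applications of artificial intelligence
print(fields["rft.au"])         # ['Nugroho, Larasmoyo', 'Andiarti, Rika']
print(fields["rft_id"][0])      # info:doi/10.1016/j.engappai.2022.105798
```

Because every field is a flat key/value pair, the same one-liner works for any OpenURL-emitting link resolver; only the `rft.*` key set varies by genre.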