Reward criteria impact on the performance of reinforcement learning agent for autonomous navigation
| Published in: | Applied Soft Computing, Volume 126, p. 109241 |
|---|---|
| Main authors: | Aveen Dayal, Linga Reddy Cenkeramaddi, Ajit Jha |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V., 01.09.2022 |
| ISSN: | 1568-4946, 1872-9681 |
| Online access: | Get full text |
| Abstract | In reinforcement learning, an agent takes an action at every time step (i.e., follows a policy) in an environment to maximize the expected cumulative reward. The shaping of the reward function therefore plays a crucial role in an agent's learning, and designing an optimal reward function is not a trivial task. In this article, we propose a reward criterion from which we develop different reward functions. The chosen criterion is based on the percentage of positive and negative rewards received by an agent, and it gives rise to three classes: 'Balanced Class', 'Skewed Positive Class', and 'Skewed Negative Class'. We train a Deep Q-Network agent on a point-goal navigation task using the different reward classes and compare their performance with that of a benchmark class. In our experiments, the skewed negative class outperforms the benchmark class by achieving much lower variance; on the other hand, the benchmark class converges faster than the skewed negative class. |
| Highlights |
• A reward criterion to assess the performance of an RL agent.
• Various reward functions to train an RL agent.
• The proportion of positive and negative rewards in a reward-shaping function.
• Three reward classes: 'Balanced Class', 'Skewed Positive Class', and 'Skewed Negative Class'.
• The performance of an RL agent in the case of autonomous navigation.
|
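The criterion described in the abstract lends itself to a compact illustration. Below is a minimal Python sketch, not the authors' implementation: three hypothetical distance-based step-reward functions for a point-goal navigation task, one per class, plus a classifier that assigns a reward stream to a class by its share of positive rewards. All function names, reward magnitudes, and the 40–60% 'balanced' band are illustrative assumptions.

```python
# A minimal sketch of the reward-criterion idea; everything here (names,
# magnitudes, thresholds) is an illustrative assumption, not the paper's code.
import numpy as np

def balanced_reward(d_prev: float, d_curr: float) -> float:
    """'Balanced Class' sketch: +1 when the step reduces the distance to the
    goal, -1 otherwise, so an exploring agent receives positive and negative
    rewards in roughly equal proportion."""
    return 1.0 if d_curr < d_prev else -1.0

def skewed_positive_reward(d_prev: float, d_curr: float) -> float:
    """'Skewed Positive Class' sketch: progress and near-progress both earn a
    positive reward, so most step rewards come out positive."""
    return 1.0 if d_curr < d_prev + 0.1 else -1.0

def skewed_negative_reward(d_prev: float, d_curr: float) -> float:
    """'Skewed Negative Class' sketch: only a clear move toward the goal is
    rewarded, so most step rewards come out negative."""
    return 1.0 if d_curr < d_prev - 0.1 else -1.0

def classify_rewards(rewards) -> str:
    """Assign a reward stream to a class by its share of positive values,
    mirroring the paper's criterion (the 40-60% band is an assumed threshold)."""
    share_positive = float(np.mean(np.asarray(rewards) > 0.0))
    if 0.4 <= share_positive <= 0.6:
        return "Balanced Class"
    return "Skewed Positive Class" if share_positive > 0.6 else "Skewed Negative Class"

# Toy usage: distances to the goal along a symmetric random walk; each reward
# function should typically land in a different one of the three classes.
rng = np.random.default_rng(0)
dists = np.abs(np.cumsum(rng.normal(0.0, 0.2, size=200)) + 5.0)
for fn in (balanced_reward, skewed_positive_reward, skewed_negative_reward):
    rewards = [fn(dists[t - 1], dists[t]) for t in range(1, len(dists))]
    print(fn.__name__, "->", classify_rewards(rewards))
```

Note that the three functions differ only in where they place the positive/negative cutoff relative to the agent's progress, which is exactly the knob the criterion varies.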
| Authors | Aveen Dayal (ORCID 0000-0001-6792-9170), aveendayal97@gmail.com, Department of Information and Communication Technology, University of Agder, Norway |
| | Linga Reddy Cenkeramaddi (ORCID 0000-0002-1023-2118), linga.cenkeramaddi@uia.no, Department of Information and Communication Technology, University of Agder, Norway |
| | Ajit Jha (ORCID 0000-0003-1435-9260), ajit.jha@uia.no, Department of Engineering Science, University of Agder, Norway |
| Copyright | 2022 The Author(s) |
| DOI | 10.1016/j.asoc.2022.109241 |
| Discipline | Computer Science |
| Keywords | Deep reinforcement learning; Autonomous navigation; Reward criteria; Machine learning and artificial intelligence |
| License | This is an open access article under the CC BY license. |
| OpenAccessLink | https://dx.doi.org/10.1016/j.asoc.2022.109241 |