Reward criteria impact on the performance of reinforcement learning agent for autonomous navigation

Detailed bibliography
Published in: Applied Soft Computing, Volume 126, Article 109241
Main authors: Dayal, Aveen; Cenkeramaddi, Linga Reddy; Jha, Ajit
Format: Journal Article
Language: English
Published: Elsevier B.V., 1 September 2022
ISSN: 1568-4946, 1872-9681
Online access: https://dx.doi.org/10.1016/j.asoc.2022.109241
Abstract
In reinforcement learning, an agent takes an action at every time step (i.e., follows a policy) in an environment to maximize the expected cumulative reward. The shaping of the reward function therefore plays a crucial role in an agent’s learning, and designing an optimal reward function is not a trivial task. In this article, we propose a reward criterion from which we develop different reward functions. The criterion is based on the percentage of positive and negative rewards received by an agent, and it gives rise to three classes: ‘Balanced Class,’ ‘Skewed Positive Class,’ and ‘Skewed Negative Class.’ We train a Deep Q-Network agent on a point-goal navigation task using the different reward classes, and we compare the performance of the proposed classes with a benchmark class. In our experiments, the skewed negative class outperforms the benchmark class by achieving much lower variance; on the other hand, the benchmark class converges faster than the skewed negative class.

Highlights
• A reward criterion to assess the performance of an RL agent.
• Various reward functions to train an RL agent.
• The proportion of positive and negative rewards in a reward shaping function.
• Three reward classes from the criterion: ‘Balanced Class,’ ‘Skewed Positive Class,’ and ‘Skewed Negative Class.’
• The performance of an RL agent in the case of autonomous navigation.
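The criterion described above labels a reward function by the share of positive versus negative rewards the agent actually receives during training. Below is a minimal Python sketch of that idea; the function name classify_reward_balance, the 10% tolerance for the balanced case, and the decision to ignore zero rewards are illustrative assumptions, not details taken from the paper.

import numpy as np

def classify_reward_balance(rewards, tolerance=0.10):
    # Hypothetical sketch: classify a trace of per-step rewards into the
    # paper's three classes by the fraction of positive vs. negative values.
    # The tolerance threshold and zero-reward handling are assumptions.
    rewards = np.asarray(rewards, dtype=float)
    nonzero = rewards[rewards != 0.0]
    if nonzero.size == 0:
        return "undefined"  # no signed rewards observed
    pos_frac = float(np.mean(nonzero > 0.0))  # share of positive rewards
    neg_frac = 1.0 - pos_frac                 # share of negative rewards
    if abs(pos_frac - neg_frac) <= tolerance:
        return "Balanced Class"
    return "Skewed Positive Class" if pos_frac > neg_frac else "Skewed Negative Class"

# Example: a sparse point-goal navigation trace with a small per-step
# penalty and a single goal bonus is dominated by negative rewards.
trace = [-0.01] * 80 + [1.0]
print(classify_reward_balance(trace))  # -> Skewed Negative Class

Under such a scheme, an agent trained with a skewed negative reward function receives mostly penalties (e.g., per-step or collision costs), which, per the abstract, yielded lower-variance performance at the cost of slower convergence.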
Article number: 109241
Authors
– Dayal, Aveen (ORCID: 0000-0001-6792-9170; aveendayal97@gmail.com; Department of Information and Communication Technology, University of Agder, Norway)
– Cenkeramaddi, Linga Reddy (ORCID: 0000-0002-1023-2118; linga.cenkeramaddi@uia.no; Department of Information and Communication Technology, University of Agder, Norway)
– Jha, Ajit (ORCID: 0000-0003-1435-9260; ajit.jha@uia.no; Department of Engineering Science, University of Agder, Norway)
DOI: 10.1016/j.asoc.2022.109241
Copyright: 2022 The Author(s)
Peer reviewed: Yes
Open access: Yes
Keywords: Deep reinforcement learning; Autonomous navigation; Reward criteria; Machine learning and artificial intelligence
License: This is an open access article under the CC BY license.