An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes


Full description

Bibliographic details
Published in: IEEE Transactions on Automatic Control, pp. 1-8
Main authors: Guin, Soumyajit; Borkar, Vivek S.; Bhatnagar, Shalabh
Format: Journal Article
Language: English
Published: IEEE 2025
ISSN: 0018-9286, 1558-2523
Online access: Full text
Abstract In this paper, we consider the risk-sensitive cost criterion with exponentiated costs for Markov decision processes and develop a model-free policy gradient algorithm in this setting. Unlike additive cost criteria such as average or discounted cost, the risk-sensitive cost criterion is less studied due to the complexity resulting from the multiplicative structure of the resulting Bellman equation. We develop an actor-critic algorithm with function approximation in this setting and provide its asymptotic convergence analysis. We also show the results of numerical experiments that demonstrate the superiority in performance of our algorithm over other recent algorithms in the literature.
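To illustrate the multiplicative structure the abstract refers to, the following is a minimal sketch and not the paper's model-free actor-critic algorithm. It assumes a fixed policy, a known transition matrix P and known one-step costs C (hypothetical toy inputs), and an irreducible, aperiodic chain. Under those assumptions the risk-sensitive cost, lim (1/n) log E[exp(sum of the first n costs)], equals log(rho), where rho is the largest eigenvalue of the matrix with entries P[i, j] * exp(C[i, j]), and the associated eigenvector V solves the multiplicative Bellman (Poisson) equation rho * V(i) = sum_j P[i, j] * exp(C[i, j]) * V(j). The function name `risk_sensitive_cost` is introduced here for illustration only.

```python
import numpy as np


def risk_sensitive_cost(P, C, iters=1000):
    """Risk-sensitive cost of a fixed policy via power iteration (toy sketch).

    P[i, j] is the transition probability and C[i, j] the one-step cost;
    both are assumed known here, unlike in the model-free setting of the paper.
    """
    M = P * np.exp(C)              # elementwise: kernel weighted by exp(cost)
    V = np.ones(P.shape[0])        # value-function iterate, kept at V[0] == 1
    rho = 1.0
    for _ in range(iters):
        W = M @ V                  # one multiplicative Bellman backup
        rho = W[0]                 # eigenvalue estimate, since V[0] == 1
        V = W / rho                # renormalise so that V[0] stays 1
    return float(np.log(rho)), V
```

With a constant one-step cost c the matrix reduces to exp(c) * P, so the returned cost is c itself, which gives a quick sanity check. For additive criteria the analogous backup would add the cost to the expected next value rather than multiply by its exponential; that difference is what the abstract means by the multiplicative structure.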
Author Bhatnagar, Shalabh
Borkar, Vivek S.
Guin, Soumyajit
Author_xml – sequence: 1
  givenname: Soumyajit
  surname: Guin
  fullname: Guin, Soumyajit
  email: gsoumyajit@iisc.ac.in
  organization: Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, Karnataka, India
– sequence: 2
  givenname: Vivek S.
  surname: Borkar
  fullname: Borkar, Vivek S.
  email: borkar.vs@gmail.com
  organization: Department of Electrical Engineering, Indian Institute of Technology, Bombay, Mumbai, India
– sequence: 3
  givenname: Shalabh
  surname: Bhatnagar
  fullname: Bhatnagar, Shalabh
  email: shalabh@iisc.ac.in
  organization: Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, Karnataka, India
CODEN IETAA9
ContentType Journal Article
DOI 10.1109/TAC.2025.3593328
Discipline Engineering
EISSN 1558-2523
EndPage 8
ExternalDocumentID 10_1109_TAC_2025_3593328
11098685
Genre orig-research
ISSN 0018-9286
IsPeerReviewed true
IsScholarly true
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 8
PublicationCentury 2000
PublicationDate 2025-00-00
PublicationDateYYYYMMDD 2025-01-01
PublicationDate_xml – year: 2025
  text: 2025-00-00
PublicationDecade 2020
PublicationTitle IEEE transactions on automatic control
PublicationTitleAbbrev TAC
PublicationYear 2025
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 1
SubjectTerms Approximation algorithms
Convergence
Costs
Function approximation
Machine learning algorithms
Markov decision processes
Training
Vectors
Zinc
Title An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes
URI https://ieeexplore.ieee.org/document/11098685