An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes
In this paper, we consider the risk-sensitive cost criterion with exponentiated costs for Markov decision processes and develop a model-free policy gradient algorithm in this setting. Unlike additive cost criteria such as average or discounted cost, the risk-sensitive cost criterion is less studied...
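To make the "exponentiated costs" criterion concrete, here is a minimal sketch. It is not the paper's model-free actor-critic algorithm. It assumes a fixed policy, a known transition matrix `P`, and a per-state cost vector `c`, all of which are illustrative inputs rather than anything from the record. Under those assumptions, the risk-sensitive cost is the logarithm of the Perron eigenvalue of the matrix with entries `exp(c[i]) * P[i][j]`. This eigenvalue form reflects the multiplicative structure the abstract mentions. The hypothetical helper `risk_sensitive_cost` estimates it by power iteration, which assumes the matrix is primitive.

```python
import math


def risk_sensitive_cost(P, c, iters=1000):
    """Estimate log(rho), where rho is the Perron eigenvalue of
    M[i][j] = exp(c[i]) * P[i][j], using power iteration.

    P: row-stochastic transition matrix under a fixed policy (assumed known).
    c: per-state cost under that policy (assumed, for illustration).
    """
    n = len(P)
    # Multiplicative "twisted" kernel: costs enter as exponential weights.
    M = [[math.exp(c[i]) * P[i][j] for j in range(n)] for i in range(n)]
    v = [1.0] * n
    rho = 1.0
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(w)               # eigenvalue estimate (v is scaled to max 1)
        v = [x / rho for x in w]   # renormalise to avoid overflow
    return math.log(rho)


# Two states visited uniformly at random, costs 0 and 1.
P = [[0.5, 0.5], [0.5, 0.5]]
risk = risk_sensitive_cost(P, [0.0, 1.0])
average = 0.5  # ordinary long-run average cost for the same chain
print(risk, average)
```

In this toy chain the risk-sensitive value exceeds the ordinary average cost of 0.5, because the exponential weights high-cost trajectories more heavily.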
| Published in: | IEEE transactions on automatic control, pp. 1-8 |
|---|---|
| Main authors: | Guin, Soumyajit; Borkar, Vivek S.; Bhatnagar, Shalabh |
| Format: | Journal Article |
| Language: | English |
| Published: | IEEE, 2025 |
| Subjects: | |
| ISSN: | 0018-9286, 1558-2523 |
| Online access: | Full text |
| Abstract | In this paper, we consider the risk-sensitive cost criterion with exponentiated costs for Markov decision processes and develop a model-free policy gradient algorithm in this setting. Unlike additive cost criteria such as average or discounted cost, the risk-sensitive cost criterion is less studied due to the complexity resulting from the multiplicative structure of the resulting Bellman equation. We develop an actor-critic algorithm with function approximation in this setting and provide its asymptotic convergence analysis. We also show the results of numerical experiments that demonstrate the superiority in performance of our algorithm over other recent algorithms in the literature. |
|---|---|
| Author | Bhatnagar, Shalabh; Borkar, Vivek S.; Guin, Soumyajit |
| Author_xml | 1. Guin, Soumyajit (gsoumyajit@iisc.ac.in), Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, Karnataka, India; 2. Borkar, Vivek S. (borkar.vs@gmail.com), Department of Electrical Engineering, Indian Institute of Technology, Bombay, Mumbai, India; 3. Bhatnagar, Shalabh (shalabh@iisc.ac.in), Department of Computer Science and Automation, Indian Institute of Science, Bengaluru, Karnataka, India |
| CODEN | IETAA9 |
| ContentType | Journal Article |
| DBID | 97E RIA RIE AAYXX CITATION |
| DOI | 10.1109/TAC.2025.3593328 |
| DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE CrossRef |
| DatabaseTitle | CrossRef |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1558-2523 |
| EndPage | 8 |
| ExternalDocumentID | 10_1109_TAC_2025_3593328 11098685 |
| Genre | orig-research |
| IEDL.DBID | RIE |
| ISSN | 0018-9286 |
| IngestDate | Sat Nov 29 07:44:10 EST 2025 Wed Aug 06 17:59:59 EDT 2025 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| PageCount | 8 |
| ParticipantIDs | ieee_primary_11098685 crossref_primary_10_1109_TAC_2025_3593328 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-00-00 |
| PublicationDateYYYYMMDD | 2025-01-01 |
| PublicationDate_xml | – year: 2025 text: 2025-00-00 |
| PublicationDecade | 2020 |
| PublicationTitle | IEEE transactions on automatic control |
| PublicationTitleAbbrev | TAC |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssj0016441 |
| Score | 2.4741788 |
| SourceID | crossref ieee |
| SourceType | Index Database Publisher |
| StartPage | 1 |
| SubjectTerms | Approximation algorithms; Convergence; Costs; Function approximation; Machine learning algorithms; Markov decision processes; Training; Vectors; Zinc |
| Title | An Actor-Critic Algorithm with Function Approximation for Risk Sensitive Cost Markov Decision Processes |
| URI | https://ieeexplore.ieee.org/document/11098685 |
| hasFullText | 1 |
| inHoldings | 1 |