Approximate Constrained Discounted Dynamic Programming With Uniform Feasibility and Optimality
| Published in: | IEEE Transactions on Automatic Control, Vol. 70, No. 6, pp. 4031-4036 |
|---|---|
| Main author: | Chang, Hyeong Soo |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 1 June 2025; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 0018-9286, 1558-2523 |
| Online access: | Full text |
| Abstract | An important question about the finite constrained Markov decision process (CMDP) problem is whether there exists a condition under which a uniformly optimal and uniformly feasible policy, one that achieves the optimal value at all initial states, exists in the set of deterministic, history-independent, stationary policies, and whether the CMDP problem under that condition can be solved by dynamic programming (DP). The question matters because the crux of the unconstrained MDP theory developed by Bellman lies in the answer to the same existence question for an optimal policy of an MDP. Although CMDPs have been studied for many years, no work has directly addressed this open question since it was raised in the literature about three decades ago. As one answer to this question, we establish that any finite CMDP problem $\mathsf{M}^{c}$ inherently "contains" a DP structure in its "subordinate" CMDP problem $\hat{\mathsf{M}}^{c}$, which is induced from the parameters of $\mathsf{M}^{c}$, and that $\hat{\mathsf{M}}^{c}$ is DP-solvable. We derive a policy-iteration-type algorithm for solving $\hat{\mathsf{M}}^{c}$, which provides an approximate solution to $\mathsf{M}^{c}$ or to $\mathsf{M}^{c}$ with a fixed initial state. |
|---|---|
| Author | Chang, Hyeong Soo |
| Author details | Hyeong Soo Chang; ORCID: 0000-0003-3298-0018; email: hschang@sogang.ac.kr; Department of Computer Science and Engineering, Sogang University, Seoul, South Korea |
| CODEN | IETAA9 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| DOI | 10.1109/TAC.2024.3523847 |
| Discipline | Engineering |
| EISSN | 1558-2523 |
| EndPage | 4036 |
| ExternalDocumentID | 10_1109_TAC_2024_3523847 10817577 |
| Genre | orig-research |
| ISICitedReferencesCount | 0 |
| ISSN | 0018-9286 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0003-3298-0018 |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-06-01 |
| PublicationDateYYYYMMDD | 2025-06-01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on automatic control |
| PublicationTitleAbbrev | TAC |
| PublicationYear | 2025 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 4031 |
| SubjectTerms | Algorithms; Approximation algorithms; Constraints; Costs; Dynamic programming; Dynamic programming (DP); Feasibility; Markov decision process; Markov processes; optimality equation; Optimization; Probability distribution; Programming; Reviews; Scheduling; Scheduling algorithms; Throughput; Trajectory; uniform-optimality |
| Title | Approximate Constrained Discounted Dynamic Programming With Uniform Feasibility and Optimality |
| URI | https://ieeexplore.ieee.org/document/10817577 https://www.proquest.com/docview/3213930868 |
| Volume | 70 |