Approximate Constrained Discounted Dynamic Programming With Uniform Feasibility and Optimality


Bibliographic details
Published in: IEEE Transactions on Automatic Control, vol. 70, no. 6, pp. 4031-4036
Main author: Chang, Hyeong Soo
Format: Journal Article
Language: English
Published: New York, IEEE, 1 June 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9286, 1558-2523
Online access: Full text
Abstract An important question about finite constrained Markov decision process (CMDP) problems is whether there exists a condition under which a uniformly optimal and uniformly feasible policy, one that achieves the optimal value at all initial states, exists in the set of deterministic, history-independent, and stationary policies, and whether a CMDP problem satisfying that condition can be solved by dynamic programming (DP). The question matters because the crux of the unconstrained MDP theory developed by Bellman lies in the answer to the same existence question for an optimal policy of an MDP. Although CMDPs have been studied for years, no work has responded to this open question since it was raised in the literature about three decades ago. We establish, as one answer to this question, that any finite CMDP problem $\mathsf{M}^{c}$ inherently "contains" a DP structure in its "subordinate" CMDP problem $\hat{\mathsf{M}}^{c}$, which is induced from the parameters of $\mathsf{M}^{c}$, and that $\hat{\mathsf{M}}^{c}$ is DP-solvable. We derive a policy-iteration-type algorithm for solving $\hat{\mathsf{M}}^{c}$, which provides an approximate solution to $\mathsf{M}^{c}$, or to $\mathsf{M}^{c}$ with a fixed initial state.
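To make the terms "uniformly feasible" and "uniformly optimal" concrete, the following is a minimal sketch on a toy two-state CMDP. It is not the article's algorithm and does not construct the subordinate problem. It assumes a common formulation that the record itself does not spell out: a policy is uniformly feasible if its discounted constraint cost is at most a threshold `kappa` at every initial state, and uniformly optimal if it maximizes the discounted reward at every initial state among the uniformly feasible deterministic stationary policies. All names and numbers (`P`, `reward`, `cost`, `gamma`, `kappa`, `evaluate`) are hypothetical.

```python
from itertools import product

def evaluate(policy, P, g, gamma, sweeps=2000):
    """Discounted value of a deterministic stationary policy for the
    per-step function g, computed by repeated fixed-point sweeps."""
    n = len(policy)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [g[x][policy[x]]
             + gamma * sum(P[x][policy[x]][y] * v[y] for y in range(n))
             for x in range(n)]
    return v

def uniformly_feasible(policy, P, cost, gamma, kappa):
    """Assumed definition: constraint value <= kappa at every initial state."""
    return all(c <= kappa + 1e-9 for c in evaluate(policy, P, cost, gamma))

def uniformly_optimal_policies(P, reward, cost, gamma, kappa):
    """Brute-force search over all deterministic stationary policies.
    Returns the feasible policies whose reward value is best at every state."""
    n, m = len(P), len(P[0])
    feasible = [pi for pi in product(range(m), repeat=n)
                if uniformly_feasible(pi, P, cost, gamma, kappa)]
    values = {pi: evaluate(pi, P, reward, gamma) for pi in feasible}
    return [pi for pi in feasible
            if all(values[pi][x] >= values[q][x] - 1e-9
                   for q in feasible for x in range(n))]

# Hypothetical toy data: action 0 stays in place, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],   # P[x][a][y] for x = 0
     [[0.0, 1.0], [1.0, 0.0]]]   # P[x][a][y] for x = 1
reward = [[1.0, 0.0], [2.0, 0.0]]  # reward[x][a]
cost = [[0.0, 1.0], [1.0, 0.0]]    # cost[x][a]
gamma, kappa = 0.5, 1.5

print(uniformly_optimal_policies(P, reward, cost, gamma, kappa))  # [(0, 1)]
```

In this toy instance the unconstrained-best policy `(0, 0)` violates the threshold at state 1, and `(0, 1)` is the only policy that is feasible and best at both states. For other data the returned list can be empty, which is exactly the existence issue the abstract describes.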
Author Chang, Hyeong Soo
  ORCID: 0000-0003-3298-0018
  Email: hschang@sogang.ac.kr
  Organization: Department of Computer Science and Engineering, Sogang University, Seoul, South Korea
CODEN IETAA9
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
DOI 10.1109/TAC.2024.3523847
Discipline Engineering
EISSN 1558-2523
EndPage 4036
ExternalDocumentID 10_1109_TAC_2024_3523847
10817577
Genre orig-research
ISSN 0018-9286
IsPeerReviewed true
IsScholarly true
Issue 6
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0003-3298-0018
PQID 3213930868
PQPubID 85475
PageCount 6
PublicationDate 2025-06-01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on automatic control
PublicationTitleAbbrev TAC
PublicationYear 2025
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 4031
SubjectTerms Algorithms
Approximation algorithms
Constraints
Costs
Dynamic programming
Dynamic programming (DP)
Feasibility
Markov decision process
Markov processes
optimality equation
Optimization
Probability distribution
Programming
Reviews
Scheduling
Scheduling algorithms
Throughput
Trajectory
uniform-optimality
Title Approximate Constrained Discounted Dynamic Programming With Uniform Feasibility and Optimality
URI https://ieeexplore.ieee.org/document/10817577
https://www.proquest.com/docview/3213930868
Volume 70