Faster algorithm and sharper analysis for constrained Markov decision process

Bibliographic Details
Published in: Operations Research Letters, Vol. 54, p. 107107
Main Authors: Li, Tianjiao; Guan, Ziwei; Zou, Shaofeng; Xu, Tengyu; Liang, Yingbin; Lan, Guanghui
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.05.2024
ISSN: 0167-6377, 1872-7468
Description
Summary: The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated reward subject to constraints on its utilities/costs. We propose a new primal-dual approach with a novel integration of entropy regularization and Nesterov's accelerated gradient method. The proposed approach is shown to converge to the global optimum with a complexity of Õ(1/ϵ) in terms of the optimality gap and the constraint violation, which improves the complexity of the existing primal-dual approaches by a factor of O(1/ϵ).
DOI: 10.1016/j.orl.2024.107107
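
Illustrative sketch: the summary describes a primal-dual scheme for a CMDP, where a policy maximizes reward while a multiplier enforces the utility constraint. The Python below is a minimal sketch of that general idea on a toy problem, not the authors' algorithm. It uses an entropy-regularized (softmax) policy update but swaps in a plain projected gradient step on the multiplier instead of Nesterov's accelerated method. The two-state problem, the temperature `tau`, the step size `eta`, the threshold `b`, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def soft_policy(P, reward, gamma, tau, iters=300):
    """Soft value iteration: solve the entropy-regularized MDP for `reward`."""
    Q = np.zeros(reward.shape)
    for _ in range(iters):
        m = Q.max(axis=1)
        # Log-sum-exp state value with temperature tau (numerically stabilized).
        V = m + tau * np.log(np.exp((Q - m[:, None]) / tau).sum(axis=1))
        Q = reward + gamma * (P @ V)
    m = Q.max(axis=1, keepdims=True)
    pi = np.exp((Q - m) / tau)
    return pi / pi.sum(axis=1, keepdims=True)

def discounted_value(P, pi, payoff, gamma, rho):
    """Expected discounted sum of `payoff` under policy `pi`, start distribution `rho`."""
    n_states = payoff.shape[0]
    P_pi = np.einsum("sa,sat->st", pi, P)   # state-to-state transitions under pi
    payoff_pi = (pi * payoff).sum(axis=1)   # expected one-step payoff under pi
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, payoff_pi)
    return float(rho @ V)

def primal_dual(P, r, g, b, gamma, rho, tau=0.1, eta=0.05, steps=100):
    """Maximize the reward value subject to utility value >= b (hypothetical helper)."""
    lam = 0.0
    for _ in range(steps):
        # Primal step: regularized best response to the Lagrangian reward r + lam * g.
        pi = soft_policy(P, r + lam * g, gamma, tau)
        v_g = discounted_value(P, pi, g, gamma, rho)
        # Dual step: plain projected gradient step (no acceleration in this sketch).
        lam = max(0.0, lam - eta * (v_g - b))
    pi = soft_policy(P, r + lam * g, gamma, tau)
    v_r = discounted_value(P, pi, r, gamma, rho)
    v_g = discounted_value(P, pi, g, gamma, rho)
    return lam, pi, v_r, v_g

# Toy CMDP (assumed): 2 states, 2 actions, uniform transitions.
# Action 0 earns reward, action 1 earns utility.
P = np.full((2, 2, 2), 0.5)                 # P[s, a, s'] transition probabilities
r = np.array([[1.0, 0.0], [1.0, 0.0]])      # reward r[s, a]
g = np.array([[0.0, 1.0], [0.0, 1.0]])      # utility g[s, a]
rho = np.array([0.5, 0.5])                  # initial state distribution

lam, pi, v_r, v_g = primal_dual(P, r, g, b=3.0, gamma=0.9, rho=rho)
print(f"lambda={lam:.3f}  reward value={v_r:.3f}  utility value={v_g:.3f}")
```

In this toy problem the multiplier grows until the policy puts just enough probability on the utility-earning action to meet the threshold, which is the trade-off the constraint is meant to enforce.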