Faster algorithm and sharper analysis for constrained Markov decision process

The problem of constrained Markov decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated reward subject to constraints on its utilities/costs. We propose a new primal-dual approach with a novel integration of entropy regularization and Nesterov's accel...

Bibliographic Details
Published in:	Operations research letters Vol. 54; p. 107107
Main Authors:	Li, Tianjiao, Guan, Ziwei, Zou, Shaofeng, Xu, Tengyu, Liang, Yingbin, Lan, Guanghui
Format:	Journal Article
Language:	English
Published:	Elsevier B.V 1 May 2024
Subjects:	Accelerated gradient method; Constrained Markov decision process; Entropy regularization; Policy optimization; Primal-dual algorithm
ISSN:	0167-6377, 1872-7468
Online Access:	Full text