Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction

We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algorithm (VR-PDPG), which updates the primal variable v...

Bibliographic Details
Published in: The Journal of artificial intelligence research, Vol. 83
Main Authors: Ying, Donghao; Guo, Mengzi Amy; Lee, Hyunin; Ding, Yuhao; Lavaei, Javad; Shen, Zuo-Jun Max
Format: Journal Article
Language: English
Published: 01.01.2025
ISSN: 1076-9757