Policy-based Primal-Dual Methods for Concave CMDP with Variance Reduction
We study Concave Constrained Markov Decision Processes (Concave CMDPs) where both the objective and constraints are defined as concave functions of the state-action occupancy measure. We propose the Variance-Reduced Primal-Dual Policy Gradient Algorithm (VR-PDPG), which updates the primal variable v...
| Published in: | The Journal of Artificial Intelligence Research, Vol. 83 |
|---|---|
| Main Authors: | , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | 01.01.2025 |
| ISSN: | 1076-9757 |