Multiagent value iteration algorithms in dynamic programming and reinforcement learning

We consider infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. In an earlier work we introduced a policy iteration algorithm, where the policy improvement is done one-agent-at-a-time in a give...

Bibliographic Details
Published in:	Results in control and optimization Vol. 1; p. 100003
Main Author:	Bertsekas, Dimitri
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 01.12.2020
ISSN:	2666-7207