Multiagent value iteration algorithms in dynamic programming and reinforcement learning
We consider infinite horizon dynamic programming problems, where the control at each stage consists of several distinct decisions, each one made by one of several agents. In an earlier work we introduced a policy iteration algorithm, where the policy improvement is done one-agent-at-a-time in a give...
| Published in: | Results in control and optimization Vol. 1; p. 100003 |
|---|---|
| Main Author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 01.12.2020 |
| ISSN: | 2666-7207 |