A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs
We propose universal randomized function approximation-based empirical value learning (EVL) algorithms for Markov decision processes. The "empirical" nature comes from each iteration being done empirically from samples available from simulations of the next state. This makes the Bellman op...
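The empirical iteration described in the abstract can be illustrated with a minimal sketch. It is an assumption-laden toy, not the authors' implementation: the one-dimensional MDP (`toy_cost`, `toy_simulate`), the random cosine features, the least-squares fit, and every parameter value are hypothetical choices made only for illustration. The expectation in the Bellman operator is replaced by a sample mean over simulated next states, and the result is then fitted with a randomly drawn function basis.

```python
import numpy as np

def toy_cost(s, a):
    # Hypothetical stage cost: penalize distance from state 0.
    return s ** 2

def toy_simulate(s, a, n, rng):
    # Hypothetical simulator: n noisy next states, clipped to [0, 1].
    return np.clip(s + a + 0.05 * rng.normal(size=n), 0.0, 1.0)

def empirical_bellman(v, states, actions, cost, simulate, gamma, n_next, rng):
    """Empirical Bellman backup: the expectation over next states
    is replaced by a sample mean over n_next simulated next states."""
    out = np.empty(len(states))
    for i, s in enumerate(states):
        q = [cost(s, a) + gamma * np.mean(v(simulate(s, a, n_next, rng)))
             for a in actions]
        out[i] = min(q)  # minimize cost over the finite action set
    return out

def random_features(n_feat, rng):
    # Randomly drawn cosine basis (an assumed choice of random features).
    w = rng.normal(scale=5.0, size=n_feat)
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_feat)
    return lambda s: np.cos(np.outer(np.atleast_1d(s), w) + b)

def evl(n_iter=30, n_states=50, n_next=20, n_feat=25, gamma=0.9, seed=0):
    rng = np.random.default_rng(seed)
    actions = (-0.1, 0.1)
    phi = random_features(n_feat, rng)
    theta = np.zeros(n_feat)
    for _ in range(n_iter):
        states = rng.uniform(0.0, 1.0, size=n_states)  # sampled states
        v = lambda x, th=theta: phi(x) @ th            # current approximation
        targets = empirical_bellman(v, states, actions, toy_cost,
                                    toy_simulate, gamma, n_next, rng)
        # Fit the random-feature weights to the empirical backup values.
        theta, *_ = np.linalg.lstsq(phi(states), targets, rcond=None)
    return lambda x: phi(x) @ theta

if __name__ == "__main__":
    v_hat = evl()
    print(v_hat(np.linspace(0.0, 1.0, 5)))
```

Because both the sampled states and the simulated next states are random, each run applies a random operator rather than the exact Bellman operator; the sketch makes no claim about convergence rates.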
| Published in: | IEEE Transactions on Automatic Control, Vol. 65, no. 1, pp. 115-129 |
|---|---|
| Main Authors: | , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; IEEE; 01.01.2020; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 0018-9286, 1558-2523 |