Quasi-Stochastic Approximation and Off-Policy Reinforcement Learning
The Robbins-Monro stochastic approximation algorithm is a foundation of many algorithmic frameworks for reinforcement learning (RL), and often an efficient approach to solving (or approximating the solution to) complex optimal control problems. However, in many cases practitioners are unable to appl...
| Published in: | Proceedings of the IEEE Conference on Decision & Control, pp. 5244-5251 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.12.2019 |
| ISSN: | 2576-2370 |