Improving the Performance of Batch-Constrained Reinforcement Learning in Continuous Action Domains via Generative Adversarial Networks
The Batch-Constrained Q-learning algorithm has been shown to overcome extrapolation error and to enable deep reinforcement learning agents to learn from a previously collected, fixed batch of transitions. However, due to the conditional Variational Autoencoder (VAE) used in the data generation module, the BC...
| Published in: | 2022 30th Signal Processing and Communications Applications Conference (SIU) pp. 1 - 4 |
|---|---|
| Main Authors: | , , , |
| Format: | Conference Proceeding |
| Language: | English, Turkish |
| Published: | IEEE, 15.05.2022 |
| Subjects: | |