Low-complexity algorithm for restless bandits with imperfect observations
We consider a class of restless bandit problems that has broad applications in reinforcement learning and stochastic optimization. We consider N independent discrete-time Markov processes, each of which has two possible states: 1 and 0 (‘good’ and ‘bad’). Only if a process is both in state 1...
| Published in: | Mathematical methods of operations research (Heidelberg, Germany) Vol. 100; no. 2; pp. 467 - 508 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Berlin/Heidelberg: Springer Berlin Heidelberg; Springer Nature B.V, 01.10.2024 |
| Subjects: | |
| ISSN: | 1432-2994, 1432-5217 |