A New Value Iteration Method for the Average Cost Dynamic Programming Problem
| Published in: | SIAM Journal on Control and Optimization, Vol. 36, No. 2, pp. 742-759 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Philadelphia, PA: Society for Industrial and Applied Mathematics, March 1998 |
| ISSN: | 0363-0129, 1095-7138 |
| Online Access: | Full text |
| Abstract: | We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. This method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. Contrary to the standard relative value iteration, our method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems. |
| DOI: | 10.1137/S0363012995291609 |
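
For orientation, the standard relative value iteration that the abstract takes as its baseline can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the paper's weighted sup-norm method: the function name `relative_value_iteration`, the transition tensor `P`, the cost matrix `g`, the reference state, and the `gauss_seidel` flag are all hypothetical choices made for this sketch; the flag only shows the kind of in-place sweep that a sup-norm contraction would justify.

```python
import numpy as np

def relative_value_iteration(P, g, ref_state=0, gauss_seidel=False,
                             tol=1e-8, max_iters=10_000):
    """Sketch of relative value iteration for a unichain average-cost MDP.

    P : (A, n, n) array, P[a, i, j] = probability of moving from i to j under action a
    g : (A, n) array,    g[a, i]    = one-stage cost of action a in state i
    ref_state : state whose differential cost is pinned to zero
    Returns an estimate of the optimal average cost and the differential costs h.
    """
    A, n, _ = P.shape
    h = np.zeros(n)
    gain = 0.0
    for _ in range(max_iters):
        h_old = h.copy()
        if gauss_seidel:
            # Gauss-Seidel sweep: each state immediately reuses the values
            # already updated in this pass.
            for i in range(n):
                q = g[:, i] + P[:, i, :] @ h   # Q-values of all actions at state i
                h[i] = q.min()
            gain = h[ref_state]
            h -= gain                          # renormalize so h[ref_state] = 0
        else:
            # Standard (Jacobi) update: every state uses the previous iterate.
            Th = (g + np.einsum('aij,j->ai', P, h_old)).min(axis=0)
            gain = Th[ref_state]
            h = Th - gain
        if np.max(np.abs(h - h_old)) < tol:
            break
    return gain, h

# Tiny hypothetical example: 2 states, 2 actions.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
g = np.array([[2.0, 1.0],
              [1.5, 3.0]])
print(relative_value_iteration(P, g))
```

The reason the Gauss-Seidel variant is of interest at all is the point the abstract makes: in-place sweeps are safe when each update is a contraction in a (weighted) sup norm, which is the property the paper establishes for its new iteration and which the standard relative value iteration lacks in general.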