A New Value Iteration Method for the Average Cost Dynamic Programming Problem

Bibliographic Details
Published in: SIAM Journal on Control and Optimization, Vol. 36, No. 2, pp. 742-759
Author: Bertsekas, Dimitri P.
Format: Journal Article
Language: English
Published: Philadelphia, PA: Society for Industrial and Applied Mathematics, March 1, 1998
ISSN: 0363-0129, 1095-7138
Online access: Full text
Description
Summary: We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. The method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. In contrast to the standard relative value iteration, the new method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
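
For orientation, the following is a minimal sketch of the standard relative value iteration that the abstract compares against, together with a Gauss-Seidel-style sweep that reuses freshly updated components within each pass. It is not the paper's new method (whose weighted sup-norm formulation is given in the full text); the transition and cost data below are hypothetical placeholders.

```python
# Illustrative sketch only: standard relative value iteration for an
# average-cost MDP, plus a Gauss-Seidel-style sweep. Not the algorithm
# proposed in the paper; data below are made up for demonstration.
import numpy as np

def relative_value_iteration(P, g, ref_state=0, tol=1e-8, max_iter=10_000):
    """P[a][i, j]: transition probabilities under action a; g[a][i]: one-stage cost.
    Returns (estimated average cost, differential cost vector h)."""
    n = P[0].shape[0]
    h = np.zeros(n)
    for _ in range(max_iter):
        # Bellman operator: (Th)(i) = min_a [ g_a(i) + sum_j P_a(i, j) h(j) ]
        Th = np.min([g[a] + P[a] @ h for a in range(len(P))], axis=0)
        h_new = Th - Th[ref_state]          # subtract the value at the reference state
        if np.max(np.abs(h_new - h)) < tol:
            return Th[ref_state], h_new
        h = h_new
    return Th[ref_state], h

def gauss_seidel_sweep(P, g, h, ref_state=0):
    """One Gauss-Seidel sweep: states are updated one at a time, and later
    updates within the same sweep use the already-updated components."""
    h = h.copy()
    for i in range(P[0].shape[0]):
        h[i] = min(g[a][i] + P[a][i] @ h for a in range(len(P)))
    return h - h[ref_state]

# Tiny 2-state, 2-action example with placeholder data
P = [np.array([[0.9, 0.1], [0.2, 0.8]]), np.array([[0.5, 0.5], [0.7, 0.3]])]
g = [np.array([2.0, 1.0]), np.array([0.5, 3.0])]
lam, h = relative_value_iteration(P, g)
print("estimated average cost:", lam)
```

The Gauss-Seidel ordering typically accelerates convergence because each state update immediately benefits from its predecessors; the abstract notes that the paper's weighted sup-norm contraction is what makes such an implementation admissible for its new method.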
DOI: 10.1137/S0363012995291609