A New Value Iteration Method for the Average Cost Dynamic Programming Problem

Bibliographic Details
Published in: SIAM Journal on Control and Optimization, Vol. 36, No. 2, pp. 742-759
Author: Bertsekas, Dimitri P.
Format: Journal Article
Language: English
Published: Philadelphia, PA: Society for Industrial and Applied Mathematics, March 1, 1998
ISSN: 0363-0129, 1095-7138
Online access: Full text
Description
Summary: We propose a new value iteration method for the classical average cost Markovian decision problem, under the assumption that all stationary policies are unichain and that, furthermore, there exists a state that is recurrent under all stationary policies. The method is motivated by a relation between the average cost problem and an associated stochastic shortest path problem. In contrast to the standard relative value iteration, the new method involves a weighted sup-norm contraction, and for this reason it admits a Gauss-Seidel implementation. Computational tests indicate that the Gauss-Seidel version of the new method substantially outperforms the standard method for difficult problems.
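
For orientation, the following is a minimal sketch of the standard relative value iteration that the abstract compares against, together with a Gauss-Seidel-style sweep that reuses freshly updated components within each pass. It is not the paper's new method (whose weighted sup-norm formulation is given in the full text); the transition and cost data below are hypothetical placeholders.

```python
# Illustrative sketch only: standard relative value iteration for an
# average-cost MDP, plus a Gauss-Seidel-style sweep. Not the algorithm
# proposed in the paper; data below are made up for demonstration.
import numpy as np

def relative_value_iteration(P, g, ref_state=0, tol=1e-8, max_iter=10_000):
    """P[a][i, j]: transition probabilities under action a; g[a][i]: one-stage cost.
    Returns (estimated average cost, differential cost vector h)."""
    n = P[0].shape[0]
    h = np.zeros(n)
    for _ in range(max_iter):
        # Bellman operator: (Th)(i) = min_a [ g_a(i) + sum_j P_a(i, j) h(j) ]
        Th = np.min([g[a] + P[a] @ h for a in range(len(P))], axis=0)
        h_new = Th - Th[ref_state]          # subtract the value at the reference state
        if np.max(np.abs(h_new - h)) < tol:
            return Th[ref_state], h_new
        h = h_new
    return Th[ref_state], h

def gauss_seidel_sweep(P, g, h, ref_state=0):
    """One Gauss-Seidel sweep: states are updated one at a time, and later
    updates within the same sweep use the already-updated components."""
    h = h.copy()
    for i in range(P[0].shape[0]):
        h[i] = min(g[a][i] + P[a][i] @ h for a in range(len(P)))
    return h - h[ref_state]

# Tiny 2-state, 2-action example with placeholder data
P = [np.array([[0.9, 0.1], [0.2, 0.8]]), np.array([[0.5, 0.5], [0.7, 0.3]])]
g = [np.array([2.0, 1.0]), np.array([0.5, 3.0])]
lam, h = relative_value_iteration(P, g)
print("estimated average cost:", lam)
```

The Gauss-Seidel ordering typically accelerates convergence because each state update immediately benefits from its predecessors; the abstract notes that the paper's weighted sup-norm contraction is what makes such an implementation admissible for its new method.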
DOI: 10.1137/S0363012995291609