Efficient sampling in approximate dynamic programming algorithms

Bibliographic details
Published in:	Computational Optimization and Applications, vol. 38, no. 3, pp. 417-443
Main authors:	Cervellera, Cristiano; Muselli, Marco
Format:	Journal Article
Language:	English
Published:	New York, Springer Nature B.V., 1 December 2007
Subjects:	Algorithms; Approximation; Dynamic programming; Hypotheses; Learning; Optimization; Stochastic models; Studies
ISSN:	0926-6003, 1573-2894

Description
Abstract:	Dynamic Programming (DP) is known to be a standard optimization tool for solving Stochastic Optimal Control (SOC) problems, either over a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms are based on approximations of the cost-to-go functions, by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. Here the problem of sample complexity, i.e., how "fast" the number of points must grow with the input dimension in order to have an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration, is discussed. It is shown that a choice of the sampling based on low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
DOI:	10.1007/s10589-007-9054-8
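
Illustration:	The abstract refers to sampling the d-dimensional state space with low-discrepancy sequences. The sketch below generates one well-known sequence of that kind, the Halton sequence, in the unit hypercube, using only the Python standard library. The record does not say which sequence the authors use, so Halton is an assumption chosen for illustration, and the names `radical_inverse`, `halton_points` and `PRIMES` are hypothetical.

```python
# Minimal sketch: Halton points in [0, 1)^dim as one example of a
# low-discrepancy sequence. Assumption: the catalog record does not
# name a specific sequence; Halton is used here only for illustration.

PRIMES = (2, 3, 5, 7, 11, 13)  # one distinct prime base per coordinate


def radical_inverse(index: int, base: int) -> float:
    """Mirror the base-`base` digits of `index` around the radix point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(n_points: int, dim: int) -> list:
    """Return the first `n_points` Halton points as tuples of length `dim`."""
    if not 1 <= dim <= len(PRIMES):
        raise ValueError("dim must be between 1 and %d" % len(PRIMES))
    # Start at index 1 so the all-zero point is skipped.
    return [
        tuple(radical_inverse(i, PRIMES[k]) for k in range(dim))
        for i in range(1, n_points + 1)
    ]


if __name__ == "__main__":
    # Print a few sample states in a 3-dimensional unit cube.
    for point in halton_points(5, 3):
        print(point)
```

In an approximate DP setting, points such as these would be rescaled to the actual state bounds and used as the locations at which the cost-to-go function is evaluated before fitting the parametric model.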