Adaptive value function approximation for continuous-state stochastic dynamic programming

Bibliographic details
Published in:	Computers & operations research, Vol. 40, No. 4, pp. 1076-1084
Main authors:	Fan, Huiyuan; Tarun, Prashant K.; Chen, Victoria C.P.
Format:	Journal Article
Language:	English
Published:	Kidlington: Elsevier Ltd, 1 April 2013; Elsevier Pergamon Press Inc
Subjects:	Applied sciences; Approximate dynamic programming; Approximation; Design of experiments; Dynamic programming; Exact sciences and technology; Experimental design; Inventory control, production control. Distribution; Inventory forecasting; Mathematical analysis; Mathematical models; Mathematical programming; Mathematics; Neural network; Neural networks; Number theoretic methods; Operational research and scientific management; Operational research. Management science; Probability and statistics; Sample size; Samples; Sciences and techniques of general use; Sequential design of experiments; Statistical analysis; Statistical methods; Statistical modeling; Statistics; Stochastic models; Studies; State space; Function approximation; Modeling; Adaptive method; Discretization; Sequential method; Inventory control; Feedforward; Sequential design; Computer simulation; Continuous function; State space method; Stochastic programming; Approximation method; Complex programming; Number theory
ISSN:	0305-0548, 1873-765X
Online access:	Full text

Description
Abstract:	Approximate dynamic programming (ADP) commonly employs value function approximation to numerically solve complex dynamic programming problems. A statistical perspective of value function approximation employs a design and analysis of computer experiments (DACE) approach, where the “computer experiment” yields points on the value function curve. The DACE approach has been used to numerically solve high-dimensional, continuous-state stochastic dynamic programming problems, and it primarily performs two tasks: (1) design of experiments and (2) statistical modeling. The use of design of experiments enables more efficient discretization. However, identifying the appropriate sample size is not straightforward. Furthermore, identifying the appropriate model structure is a well-known problem in the field of statistics. In this paper, we present a sequential method that can adaptively determine both sample size and model structure. Number-theoretic methods (NTM) are used to sequentially grow the experimental design because of their ability to fill the design space. Feed-forward neural networks (NNs) are used for statistical modeling because their structural complexity is adjustable. This adaptive value function approximation (AVFA) method must be automated to enable efficient implementation within ADP. An AVFA algorithm is introduced that increments the size of the state space training data in each sequential step; for each sample size, a successive model search process is performed to find an optimal NN model. The new algorithm is tested on a nine-dimensional inventory forecasting problem.
DOI:	10.1016/j.cor.2012.11.016
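
The sequential procedure summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Halton sequence is assumed as the number-theoretic design (the abstract does not name a specific one), and the names `halton_points`, `adaptive_fit`, `value_fn`, and `fit_fn` are hypothetical. Model fitting is left to a caller-supplied `fit_fn`; in the paper that role is played by a feed-forward neural network whose structure is searched at each sample size.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]
Model = Callable[[Point], float]

# One prime base per state dimension (assumption: at most nine dimensions).
PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23]


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse of a positive integer in the given base."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_points(start: int, stop: int, dim: int) -> List[Point]:
    """Halton points with indices start+1 .. stop in the unit cube [0, 1)^dim."""
    return [
        tuple(radical_inverse(i, PRIMES[d]) for d in range(dim))
        for i in range(start + 1, stop + 1)
    ]


def adaptive_fit(
    value_fn: Callable[[Point], float],
    fit_fn: Callable[[List[Point], List[float], int], Model],
    dim: int,
    sample_sizes: Sequence[int],   # increasing design sizes to try
    complexities: Sequence[int],   # e.g. candidate hidden-node counts
    test_points: Sequence[Point],  # non-empty hold-out set
    tol: float,
) -> Tuple[float, int, int, Model]:
    """Grow the design step by step; at each size, search over model complexity.

    Returns (test error, sample size, complexity, model) for the best model
    found at the last sample size that was tried.
    """
    design: List[Point] = []
    values: List[float] = []
    test_values = [value_fn(x) for x in test_points]
    best = None
    for n in sample_sizes:
        # Extend the existing design; earlier points and their values are reused.
        new_points = halton_points(len(design), n, dim)
        design.extend(new_points)
        values.extend(value_fn(x) for x in new_points)

        best = None
        for c in complexities:
            model = fit_fn(design, values, c)
            # Mean absolute error on the hold-out points.
            err = sum(abs(model(x) - y) for x, y in zip(test_points, test_values))
            err /= len(test_points)
            if best is None or err < best[0]:
                best = (err, n, c, model)

        if best[0] <= tol:  # accurate enough: stop growing the design
            break
    return best
```

Because the first n points of a Halton sequence are a prefix of any longer run, each step only evaluates `value_fn` at the newly added points. The stopping rule (a tolerance on hold-out error) is likewise an assumption made for the sketch; the abstract does not specify the criterion used.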