An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...

Bibliographic details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, vol. 24, no. 12, pp. 2088-2100
Main authors:	Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil
Format:	Journal Article
Language:	English
Published:	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 December 2013
Subjects:	Adaptive dynamic programming (ADP) | Algorithm design and analysis | Algorithms | Applied sciences | Approximation algorithms | Artificial intelligence | Back propagation | Backpropagation | Backpropagation algorithm | Backpropagation through time | Computer science; control theory; systems | Control theory. Systems | Convergence | Dual heuristic programming (DHP) | Dynamic programming | Dynamical systems | Equations | Exact sciences and technology | Function approximation | Gradient | Greedy algorithm | Heuristic method | Learning | Learning algorithm | Learning and adaptive systems | Mathematical models | Modeling | Neural network | Neural networks | Optimal control | Policies | Smooth function | Trajectory | Value-gradient learning | Vectors
ISSN:	2162-237X, 2162-2388