An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...

Bibliographic details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, Vol. 24, No. 12, pp. 2088-2100
Main authors:	Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil
Format:	Journal Article
Language:	English
Publication details:	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 December 2013
Subjects:	Adaptive dynamic programming (ADP), Algorithm design and analysis, Algorithms, Applied sciences, Approximation algorithms, Artificial intelligence, Back propagation, Backpropagation, Backpropagation algorithm, Backpropagation through time, Computer science; control theory; systems, Control theory. Systems, Convergence, Dual heuristic programming (DHP), Dynamic programming, Dynamical systems, Equations, Exact sciences and technology, Function approximation, Gradient, Greedy algorithm, Heuristic method, Learning, Learning algorithm, Learning and adaptive systems, Mathematical models, Modeling, Neural network, Neural networks, Optimal control, Policies, Smooth function, Trajectory, Value-gradient learning, Vectors
ISSN:	2162-237X, 2162-2388