An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...

Bibliographic details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, Vol. 24, No. 12, pp. 2088-2100
Main authors:	Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil
Format:	Journal Article
Language:	English
Published:	New York, NY: Institute of Electrical and Electronics Engineers (IEEE), 1 December 2013
Subjects:	Adaptive dynamic programming (ADP), Algorithm design and analysis, Algorithms, Applied sciences, Approximation algorithms, Artificial intelligence, Back propagation, Backpropagation, Backpropagation algorithm, Backpropagation through time, Computer science; control theory; systems, Control theory. Systems, Convergence, Dual heuristic programming (DHP), Dynamic programming, Dynamical systems, Equations, Exact sciences and technology, Function approximation, Gradient, Greedy algorithm, Heuristic method, Learning, Learning algorithm, Learning and adaptive systems, Mathematical models, Modeling, Neural network, Neural networks, Optimal control, Policies, Smooth function, Trajectory, Value-gradient learning, Vectors
ISSN:	2162-237X, 2162-2388