An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...

Bibliographic Details
Published in:	IEEE Transactions on Neural Networks and Learning Systems, Vol. 24, no. 12, pp. 2088-2100
Main Authors:	Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil
Format:	Journal Article
Language:	English
Published:	New York, NY: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.12.2013
Subjects:	Adaptive dynamic programming (ADP) | Algorithm design and analysis | Algorithms | Applied sciences | Approximation algorithms | Artificial intelligence | Back propagation | Backpropagation | Backpropagation algorithm | Backpropagation through time | Computer science; control theory; systems | Control theory. Systems | Convergence | Dual heuristic programming (DHP) | Dynamic programming | Dynamical systems | Equations | Exact sciences and technology | Function approximation | Gradient | Greedy algorithm | Heuristic method | Learning | Learning algorithm | Learning and adaptive systems | Mathematical models | Modeling | Neural network | Neural networks | Optimal control | Policies | Smooth function | Trajectory | Value-gradient learning | Vectors
ISSN:	2162-237X, 2162-2388