An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time
We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...
Uložené v:
| Vydané v: | IEEE transaction on neural networks and learning systems Ročník 24; číslo 12; s. 2088 - 2100 |
|---|---|
| Hlavní autori: | , , |
| Médium: | Journal Article |
| Jazyk: | English |
| Vydavateľské údaje: |
New York, NY
IEEE
01.12.2013
Institute of Electrical and Electronics Engineers The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Predmet: | |
| ISSN: | 2162-237X, 2162-2388, 2162-2388 |
| On-line prístup: | Získať plný text |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
Buďte prvý, kto okomentuje tento záznam!