An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time

We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...

Bibliographic details
Published in: IEEE Transactions on Neural Networks and Learning Systems, Vol. 24, No. 12, pp. 2088-2100
Main authors: Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil
Format: Journal Article
Language: English
Published: New York, NY: IEEE, 1 December 2013
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2162-237X, 2162-2388
Abstract We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP into a new algorithm that we call Value-Gradient Learning, VGL(λ), and prove equivalence of an instance of the new algorithm to Backpropagation Through Time for Control with a greedy policy. Not only does this equivalence provide a link between these two different approaches, but it also enables our variant of DHP to have guaranteed convergence, under certain smoothness conditions and a greedy policy, when using a general smooth nonlinear function approximator for the critic. We consider several experimental scenarios including some that prove divergence of DHP under a greedy policy, which contrasts against our proven-convergent algorithm.
Author Fairbank, Michael
Prokhorov, Danil
Alonso, Eduardo
Author_xml – sequence: 1
  givenname: Michael
  surname: Fairbank
  fullname: Fairbank, Michael
  email: michael.fairbank@virgin.net
  organization: Dept. of Comput. Sci., City Univ. London, London, UK
– sequence: 2
  givenname: Eduardo
  surname: Alonso
  fullname: Alonso, Eduardo
  email: e.alonso@city.ac.uk
  organization: Dept. of Comput. Sci., City Univ. London, London, UK
– sequence: 3
  givenname: Danil
  surname: Prokhorov
  fullname: Prokhorov, Danil
  email: dvprokhorov@gmail.com
  organization: Toyota Res. Inst. NA, Ann Arbor, MI, USA
BackLink http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=28074600$$DView record in Pascal Francis
https://www.ncbi.nlm.nih.gov/pubmed/24805225$$D View this record in MEDLINE/PubMed
CODEN ITNNAL
CitedBy_id crossref_primary_10_1016_j_jfranklin_2014_11_008
crossref_primary_10_1016_j_arcontrol_2019_01_003
crossref_primary_10_1109_TNNLS_2014_2297991
crossref_primary_10_1109_TSMC_2018_2868510
crossref_primary_10_1109_TNNLS_2021_3071545
crossref_primary_10_1177_01423312221149776
crossref_primary_10_1016_j_ins_2015_04_044
crossref_primary_10_1109_TNNLS_2016_2586303
crossref_primary_10_1109_TNNLS_2015_2401334
crossref_primary_10_1109_TFUZZ_2019_2912349
crossref_primary_10_2514_1_G001154
crossref_primary_10_1109_TNNLS_2020_3021037
crossref_primary_10_1177_10597123231166416
crossref_primary_10_1109_TNNLS_2019_2919338
crossref_primary_10_1007_s11768_021_00056_w
crossref_primary_10_1109_TNNLS_2016_2541020
crossref_primary_10_1016_j_neucom_2018_10_059
crossref_primary_10_1016_j_conengprac_2021_104807
crossref_primary_10_1109_TNNLS_2015_2504035
crossref_primary_10_1177_1059712315589355
crossref_primary_10_1109_TNNLS_2015_2487972
crossref_primary_10_1109_TNNLS_2016_2516948
crossref_primary_10_1177_01423312231156946
crossref_primary_10_1007_s00034_017_0572_z
crossref_primary_10_1016_j_neunet_2024_107034
crossref_primary_10_1109_TNNLS_2014_2306201
crossref_primary_10_1007_s11042_018_5856_1
ContentType Journal Article
Copyright 2015 INIST-CNRS
Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Dec 2013
Copyright_xml – notice: 2015 INIST-CNRS
– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Dec 2013
DOI 10.1109/TNNLS.2013.2271778
DatabaseName IEEE Xplore (IEEE)
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Xplore Digital Library (LUT)
CrossRef
Pascal-Francis
PubMed
Aluminium Industry Abstracts
Biotechnology Research Abstracts
Calcium & Calcified Tissue Abstracts
Ceramic Abstracts
Chemoreception Abstracts
Computer and Information Systems Abstracts
Corrosion Abstracts
Electronics & Communications Abstracts
Engineered Materials Abstracts
Materials Business File
Mechanical & Transportation Engineering Abstracts
Neurosciences Abstracts
Solid State and Superconductivity Abstracts
METADEX
Technology Research Database
ANTE: Abstracts in New Technology & Engineering
Engineering Research Database
Aerospace Database
Materials Research Database
ProQuest Computer Science Collection
Civil Engineering Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Biotechnology and BioEngineering Abstracts
MEDLINE - Academic
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
– sequence: 2
  dbid: RIE
  name: IEEE Xplore Digital Library (LUT)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
– sequence: 3
  dbid: 7X8
  name: MEDLINE - Academic
  url: https://search.proquest.com/medline
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
Applied Sciences
EISSN 2162-2388
EndPage 2100
ExternalDocumentID 3118401891
24805225
28074600
10_1109_TNNLS_2013_2271778
6588970
Genre orig-research
Journal Article
ISICitedReferencesCount 33
ISSN 2162-237X
2162-2388
IsPeerReviewed false
IsScholarly true
Issue 12
Keywords Backpropagation
Gradient
dual heuristic programming (DHP)
neural networks
backpropagation through time
Neural network
value-gradient learning
Function approximation
Modeling
Backpropagation algorithm
Adaptive dynamic programming (ADP)
Smooth function
Heuristic method
Greedy algorithm
Dynamic programming
Learning algorithm
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
CC BY 4.0
LinkModel DirectLink
PMID 24805225
PQID 1448696191
PQPubID 85436
PageCount 13
PublicationCentury 2000
PublicationDate 2013-12-01
PublicationDateYYYYMMDD 2013-12-01
PublicationDate_xml – month: 12
  year: 2013
  text: 2013-12-01
  day: 01
PublicationDecade 2010
PublicationPlace New York, NY
PublicationPlace_xml – name: New York, NY
– name: United States
– name: Piscataway
PublicationTitle IEEE Transactions on Neural Networks and Learning Systems
PublicationTitleAbbrev TNNLS
PublicationTitleAlternate IEEE Trans Neural Netw Learn Syst
PublicationYear 2013
Publisher IEEE
Institute of Electrical and Electronics Engineers
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: Institute of Electrical and Electronics Engineers
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 2088
SubjectTerms Adaptive dynamic programming (ADP)
Algorithm design and analysis
Algorithms
Applied sciences
Approximation algorithms
Artificial intelligence
Back propagation
backpropagation through time
Computer science; control theory; systems
Control theory. Systems
Convergence
dual heuristic programming (DHP)
Dynamic programming
Dynamical systems
Equations
Exact sciences and technology
Learning
Learning and adaptive systems
Mathematical models
Neural networks
Optimal control
Policies
Trajectory
value-gradient learning
Vectors
Title An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time
URI https://ieeexplore.ieee.org/document/6588970
https://www.ncbi.nlm.nih.gov/pubmed/24805225
https://www.proquest.com/docview/1448696191
https://www.proquest.com/docview/1464595890
https://www.proquest.com/docview/1523403347
Volume 24