An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time
We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP int...
| Published in: | IEEE Transactions on Neural Networks and Learning Systems, Vol. 24, no. 12, pp. 2088-2100 |
|---|---|
| Main Authors: | Fairbank, Michael; Alonso, Eduardo; Prokhorov, Danil |
| Format: | Journal Article |
| Language: | English |
| Published: | New York, NY: IEEE, 01.12.2013. Institute of Electrical and Electronics Engineers; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2162-237X, 2162-2388 |
| Abstract | We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using learned model functions of the environment. DHP is designed for optimizing control problems in large and continuous state spaces. We extend DHP into a new algorithm that we call Value-Gradient Learning, VGL(λ), and prove equivalence of an instance of the new algorithm to Backpropagation Through Time for Control with a greedy policy. Not only does this equivalence provide a link between these two different approaches, but it also enables our variant of DHP to have guaranteed convergence, under certain smoothness conditions and a greedy policy, when using a general smooth nonlinear function approximator for the critic. We consider several experimental scenarios including some that prove divergence of DHP under a greedy policy, which contrasts against our proven-convergent algorithm. |
| Author | Fairbank, Michael; Prokhorov, Danil; Alonso, Eduardo |
| Author_xml | 1. Fairbank, Michael (michael.fairbank@virgin.net), Dept. of Comput. Sci., City Univ. London, London, UK; 2. Alonso, Eduardo (e.alonso@city.ac.uk), Dept. of Comput. Sci., City Univ. London, London, UK; 3. Prokhorov, Danil (dvprokhorov@gmail.com), Toyota Res. Inst. NA, Ann Arbor, MI, USA |
| BackLink | http://pascal-francis.inist.fr/vibad/index.php?action=getRecordDetail&idt=28074600$$DView record in Pascal Francis https://www.ncbi.nlm.nih.gov/pubmed/24805225$$D View this record in MEDLINE/PubMed |
| CODEN | ITNNAL |
| CitedBy_id | crossref_primary_10_1016_j_jfranklin_2014_11_008 crossref_primary_10_1016_j_arcontrol_2019_01_003 crossref_primary_10_1109_TNNLS_2014_2297991 crossref_primary_10_1109_TSMC_2018_2868510 crossref_primary_10_1109_TNNLS_2021_3071545 crossref_primary_10_1177_01423312221149776 crossref_primary_10_1016_j_ins_2015_04_044 crossref_primary_10_1109_TNNLS_2016_2586303 crossref_primary_10_1109_TNNLS_2015_2401334 crossref_primary_10_1109_TFUZZ_2019_2912349 crossref_primary_10_2514_1_G001154 crossref_primary_10_1109_TNNLS_2020_3021037 crossref_primary_10_1177_10597123231166416 crossref_primary_10_1109_TNNLS_2019_2919338 crossref_primary_10_1007_s11768_021_00056_w crossref_primary_10_1109_TNNLS_2016_2541020 crossref_primary_10_1016_j_neucom_2018_10_059 crossref_primary_10_1016_j_conengprac_2021_104807 crossref_primary_10_1109_TNNLS_2015_2504035 crossref_primary_10_1177_1059712315589355 crossref_primary_10_1109_TNNLS_2015_2487972 crossref_primary_10_1109_TNNLS_2016_2516948 crossref_primary_10_1177_01423312231156946 crossref_primary_10_1007_s00034_017_0572_z crossref_primary_10_1016_j_neunet_2024_107034 crossref_primary_10_1109_TNNLS_2014_2306201 crossref_primary_10_1007_s11042_018_5856_1 |
| Cites_doi | 10.1109/TNNLS.2012.2205268 10.1109/IJCNN.2012.6252791 10.1109/IJCNN.2012.6252569 10.1109/TIA.2003.809438 10.1162/089976600300015961 10.1109/21.229449 10.1109/ICNN.1993.298623 10.1109/5.58337 10.1007/BF00115009 10.1109/MCI.2009.932261 10.1109/ICSMC.1997.633056 10.1016/0893-6080(90)90088-3 10.1109/TSMCB.2008.926614 10.1117/12.343068 10.1109/ACC.2011.5991378 10.1109/ICNN.1997.616109 10.1109/TSMC.1983.6313077 10.1109/72.623201 10.1016/B978-1-55860-377-6.50013-X |
| ContentType | Journal Article |
| Copyright | 2015 INIST-CNRS Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Dec 2013 |
| Copyright_xml | – notice: 2015 INIST-CNRS – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) Dec 2013 |
| DBID | 97E RIA RIE AAYXX CITATION IQODW NPM 7QF 7QO 7QP 7QQ 7QR 7SC 7SE 7SP 7SR 7TA 7TB 7TK 7U5 8BQ 8FD F28 FR3 H8D JG9 JQ2 KR7 L7M L~C L~D P64 7X8 |
| DOI | 10.1109/TNNLS.2013.2271778 |
| DatabaseName | IEEE All-Society Periodicals Package (ASPP) 2005–Present IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Pascal-Francis PubMed Aluminium Industry Abstracts Biotechnology Research Abstracts Calcium & Calcified Tissue Abstracts Ceramic Abstracts Chemoreception Abstracts Computer and Information Systems Abstracts Corrosion Abstracts Electronics & Communications Abstracts Engineered Materials Abstracts Materials Business File Mechanical & Transportation Engineering Abstracts Neurosciences Abstracts Solid State and Superconductivity Abstracts METADEX Technology Research Database ANTE: Abstracts in New Technology & Engineering Engineering Research Database Aerospace Database Materials Research Database ProQuest Computer Science Collection Civil Engineering Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional Biotechnology and BioEngineering Abstracts MEDLINE - Academic |
| DatabaseTitle | CrossRef PubMed Materials Research Database Technology Research Database Computer and Information Systems Abstracts – Academic Mechanical & Transportation Engineering Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Materials Business File Aerospace Database Engineered Materials Abstracts Biotechnology Research Abstracts Chemoreception Abstracts Advanced Technologies Database with Aerospace ANTE: Abstracts in New Technology & Engineering Civil Engineering Abstracts Aluminium Industry Abstracts Electronics & Communications Abstracts Ceramic Abstracts Neurosciences Abstracts METADEX Biotechnology and BioEngineering Abstracts Computer and Information Systems Abstracts Professional Solid State and Superconductivity Abstracts Engineering Research Database Calcium & Calcified Tissue Abstracts Corrosion Abstracts MEDLINE - Academic |
| DatabaseTitleList | Materials Research Database MEDLINE - Academic PubMed Technology Research Database |
| Database_xml | – sequence: 1 dbid: NPM name: PubMed url: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed sourceTypes: Index Database – sequence: 2 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher – sequence: 3 dbid: 7X8 name: MEDLINE - Academic url: https://search.proquest.com/medline sourceTypes: Aggregation Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science Applied Sciences |
| EISSN | 2162-2388 |
| EndPage | 2100 |
| ExternalDocumentID | 3118401891 24805225 28074600 10_1109_TNNLS_2013_2271778 6588970 |
| Genre | orig-research Journal Article |
| GroupedDBID | 0R~ 4.4 5VS 6IK 97E AAJGR AARMG AASAJ AAWTH ABAZT ABQJQ ABVLG ACIWK ACPRK AENEX AFRAH AGQYO AGSQL AHBIQ AKJIK AKQYR ALMA_UNASSIGNED_HOLDINGS ATWAV BEFXN BFFAM BGNUA BKEBE BPEOZ EBS EJD IFIPE IPLJI JAVBF M43 MS~ O9- OCL PQQKQ RIA RIE RNS AAYXX CITATION IQODW RIG NPM 7QF 7QO 7QP 7QQ 7QR 7SC 7SE 7SP 7SR 7TA 7TB 7TK 7U5 8BQ 8FD F28 FR3 H8D JG9 JQ2 KR7 L7M L~C L~D P64 7X8 |
| ID | FETCH-LOGICAL-c414t-a565e66482829dba2e838747a98cb506eff4b3bdfe0b79f21a7ed41c56a0bec93 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 33 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000326940600015&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 2162-237X 2162-2388 |
| IngestDate | Thu Oct 02 09:44:48 EDT 2025 Thu Oct 02 10:55:03 EDT 2025 Mon Jun 30 04:18:58 EDT 2025 Thu Apr 03 07:04:29 EDT 2025 Wed Apr 02 07:37:51 EDT 2025 Sat Nov 29 01:39:48 EST 2025 Tue Nov 18 22:18:28 EST 2025 Tue Aug 26 16:42:11 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 12 |
| Keywords | Backpropagation; Gradient; dual heuristic programming (DHP); neural networks; backpropagation through time; Neural network; value-gradient learning; Function approximation; Modeling; Backpropagation algorithm; Adaptive dynamic programming (ADP); Smooth function; Heuristic method; Greedy algorithm; Dynamic programming; Learning algorithm |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html CC BY 4.0 |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-c414t-a565e66482829dba2e838747a98cb506eff4b3bdfe0b79f21a7ed41c56a0bec93 |
| PMID | 24805225 |
| PQID | 1448696191 |
| PQPubID | 85436 |
| PageCount | 13 |
| ParticipantIDs | crossref_primary_10_1109_TNNLS_2013_2271778 ieee_primary_6588970 proquest_miscellaneous_1464595890 crossref_citationtrail_10_1109_TNNLS_2013_2271778 pubmed_primary_24805225 pascalfrancis_primary_28074600 proquest_journals_1448696191 proquest_miscellaneous_1523403347 |
| PublicationCentury | 2000 |
| PublicationDate | 2013-12-01 |
| PublicationDateYYYYMMDD | 2013-12-01 |
| PublicationDate_xml | – month: 12 year: 2013 text: 2013-12-01 day: 01 |
| PublicationDecade | 2010 |
| PublicationPlace | New York, NY |
| PublicationPlace_xml | – name: New York, NY – name: United States – name: Piscataway |
| PublicationTitle | IEEE Transactions on Neural Networks and Learning Systems |
| PublicationTitleAbbrev | TNNLS |
| PublicationTitleAlternate | IEEE Trans Neural Netw Learn Syst |
| PublicationYear | 2013 |
| Publisher | IEEE Institute of Electrical and Electronics Engineers The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: Institute of Electrical and Electronics Engineers – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SSID | ssj0000605649 |
| Score | 2.2867706 |
| Snippet | We consider the adaptive dynamic programming technique called Dual Heuristic Programming (DHP), which is designed to learn a critic function, when using... |
| SourceID | proquest pubmed pascalfrancis crossref ieee |
| SourceType | Aggregation Database Index Database Enrichment Source Publisher |
| StartPage | 2088 |
| SubjectTerms | Adaptive dynamic programming (ADP) Algorithm design and analysis Algorithms Applied sciences Approximation algorithms Artificial intelligence Back propagation backpropagation through time Computer science; control theory; systems Control theory. Systems Convergence dual heuristic programming (DHP) Dynamic programming Dynamical systems Equations Exact sciences and technology Learning Learning and adaptive systems Mathematical models Neural networks Optimal control Policies Trajectory value-gradient learning Vectors |
| Title | An Equivalence Between Adaptive Dynamic Programming With a Critic and Backpropagation Through Time |
| URI | https://ieeexplore.ieee.org/document/6588970 https://www.ncbi.nlm.nih.gov/pubmed/24805225 https://www.proquest.com/docview/1448696191 https://www.proquest.com/docview/1464595890 https://www.proquest.com/docview/1523403347 |
| Volume | 24 |
| WOSCitedRecordID | wos000326940600015&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 2162-2388 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0000605649 issn: 2162-237X databaseCode: RIE dateStart: 20120101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE |