Reinforcement Learning-Based Linear Quadratic Regulation of Continuous-Time Systems Using Dynamic Output Feedback
| Published in: | IEEE Transactions on Cybernetics, Volume 50, Issue 11, pp. 4670-4679 |
|---|---|
| Main Authors: | Rizvi, Syed Ali Asad; Lin, Zongli |
| Format: | Journal Article |
| Language: | English |
| Published: | United States: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.11.2020 |
| Topics: | Adaptive dynamic programming (ADP); Algorithms; Continuous-time systems; Cost function; Dynamic programming; Feedback control; Iterative methods; Learning; Linear quadratic regulator (LQR); Mathematical models; Optimal control; Optimization; Output feedback; Parameter estimation; Parameterization; Reinforcement learning (RL); Riccati equation; Stability analysis; State feedback |
| ISSN: | 2168-2267 (print); 2168-2275 (electronic) |
| Online Access: | Get full text: https://ieeexplore.ieee.org/document/8600378 |
| Abstract: | In this paper, we propose a model-free solution to the linear quadratic regulation (LQR) problem of continuous-time systems based on reinforcement learning using dynamic output feedback. The design objective is to learn the optimal control parameters using only measurable input-output data, without requiring model information. A state parametrization scheme is presented that reconstructs the system state from filtered input and output signals. Based on this parametrization, two new output feedback adaptive dynamic programming Bellman equations are derived for the LQR problem, based on policy iteration (PI) and value iteration (VI). Unlike existing output feedback methods for continuous-time systems, the need for discrete approximation is obviated. In contrast with static output feedback controllers, the proposed method can also handle systems that are state feedback stabilizable but not static output feedback stabilizable. An advantage of this scheme is that it is immune to the exploration bias issue. Moreover, it does not require a discounted cost function and, thus, ensures both the closed-loop stability and the optimality of the solution. Compared with earlier output feedback results, the proposed VI method does not require an initially stabilizing policy. We show that the estimates of the control parameters converge to those obtained by solving the LQR algebraic Riccati equation. A comprehensive simulation study is carried out to verify the proposed algorithms. |
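For context, the abstract's claim that the learned parameters converge to the algebraic Riccati equation (ARE) solution refers to the standard continuous-time LQR setup. A minimal statement (standard material, not the paper's own notation):

```latex
% Continuous-time LTI plant; in the output feedback setting only u and y are measurable:
%   \dot{x} = A x + B u, \qquad y = C x
\begin{aligned}
J &= \int_0^\infty \bigl( x^\top Q x + u^\top R u \bigr)\,dt,
  \qquad Q = Q^\top \succeq 0,\ R = R^\top \succ 0,\\
0 &= A^\top P + P A + Q - P B R^{-1} B^\top P,
  \qquad u^\ast = -K^\ast x,\quad K^\ast = R^{-1} B^\top P.
\end{aligned}
```

A model-based design solves the ARE for $P$ and applies $u^\ast$; the point of the paper is to recover $K^\ast$ from input-output data alone, without knowing $(A, B, C)$.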
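The enabling step the abstract describes, reconstructing the state from filtered input and output signals, admits a well-known form using Kreisselmeier-type filters. Whether this matches the paper's exact construction is an assumption on my part, but it conveys the idea:

```latex
% Stable filters driven by the measurable signals
% (\Lambda Hurwitz, (\Lambda, b) a controllable pair):
\dot{z}_u = (I_m \otimes \Lambda)\, z_u + (I_m \otimes b)\, u,
\qquad
\dot{z}_y = (I_p \otimes \Lambda)\, z_y + (I_p \otimes b)\, y.
```

For an observable pair $(A, C)$ there exists a constant matrix $W$ with $x(t) = W\,[z_u^\top\ z_y^\top]^\top + \varepsilon(t)$, where $\varepsilon(t)$ decays exponentially, so the state is asymptotically a linear function of measurable filtered data and the Bellman equations can be written entirely in terms of $(z_u, z_y)$.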
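The abstract also states that both the PI and VI estimates converge to the gain obtained from the ARE. As a reference point only (the paper itself is model-free; this sketch is the classical model-based iteration that such data-driven methods emulate, with hypothetical plant matrices not taken from the paper):

```python
# Model-based policy iteration (Kleinman's algorithm) for continuous-time LQR,
# checked against the direct ARE solution.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])          # hypothetical stable plant
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)                          # state weight, Q = Q^T >= 0
R = np.eye(1)                          # input weight, R = R^T > 0

# Benchmark gain from solving the algebraic Riccati equation directly.
P_are = solve_continuous_are(A, B, Q, R)
K_are = np.linalg.solve(R, B.T @ P_are)

# Policy iteration needs an initially stabilizing gain; K = 0 works here
# because A is Hurwitz. Removing this requirement is exactly what the
# paper's value iteration (VI) variant achieves.
K = np.zeros((1, 2))
for _ in range(20):
    Ak = A - B @ K
    # Policy evaluation: solve Ak^T P + P Ak = -(Q + K^T R K) for P.
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement.
    K = np.linalg.solve(R, B.T @ P)

print("||K - K_are|| =", np.linalg.norm(K - K_are))   # ~0: converged to ARE gain
```

In the paper's setting the policy evaluation step is replaced by a least-squares fit of an output feedback Bellman equation along measured trajectories, so no Lyapunov or Riccati equation involving $(A, B)$ is ever solved explicitly.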
| Authors: | Syed Ali Asad Rizvi (ORCID 0000-0003-1412-8841, sr9gs@virginia.edu) and Zongli Lin (ORCID 0000-0003-1589-1443, zl5y@virginia.edu), Charles L. Brown Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, USA |
| CODEN: | ITCEB8 |
| Copyright: | © 2020 The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| DOI: | 10.1109/TCYB.2018.2886735 |
| Peer Reviewed: | Yes |
| PMID: | 30605117 |