Output feedback adaptive dynamic programming for linear differential zero-sum games
Saved in:
| Published in: | Automatica (Oxford) Vol. 122; p. 109272 |
|---|---|
| Main Authors: | Rizvi, Syed Ali Asad; Lin, Zongli |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 01.12.2020 |
| Subjects: | Adaptive dynamic programming; Approximate dynamic programming; Output feedback; Zero-sum games |
| ISSN: | 0005-1098, 1873-2836 |
| Online Access: | Full text |
| Abstract | This paper addresses the problem of finding optimal output feedback strategies for solving linear differential zero-sum games using a model-free approach based on adaptive dynamic programming (ADP). In contrast to their discrete-time counterparts, differential games involve continuous-time dynamics and existing ADP approaches to their solutions require access to full measurement of the internal state. This difficulty is due to the fact that direct translation of the discrete-time output feedback ADP results requires derivatives of the input and output measurements, which is generally prohibitive in practice. This work aims to overcome this difficulty and presents a new embedded filtering based observer approach towards designing output feedback ADP algorithms for solving the differential zero-sum game problem. Two output feedback ADP algorithms based respectively on policy iteration and value iteration are developed. The proposed scheme is completely online in nature and works without requiring information of the system dynamics. In addition, this work also addresses the excitation bias problem encountered in output feedback ADP methods, which typically requires a discounting factor for its mitigation. We show that the proposed scheme is bias-free, and therefore, does not require a discounting factor. It is shown that the proposed algorithms converge to the solution obtained by solving the game algebraic Riccati equation. Two numerical examples are demonstrated to validate the proposed scheme. |
|---|---|
| ArticleNumber | 109272 |
| Author | Rizvi, Syed Ali Asad; Lin, Zongli |
| ContentType | Journal Article |
| Copyright | 2020 Elsevier Ltd |
| DOI | 10.1016/j.automatica.2020.109272 |
| Discipline | Engineering |
| EISSN | 1873-2836 |
| ISSN | 0005-1098 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Approximate dynamic programming Zero-sum games Adaptive dynamic programming Output feedback |
| Language | English |
| PublicationDate | December 2020 |
| PublicationTitle | Automatica (Oxford) |
| PublicationYear | 2020 |
| Publisher | Elsevier Ltd |
| StartPage | 109272 |
| URI | https://dx.doi.org/10.1016/j.automatica.2020.109272 |
| Volume | 122 |
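The abstract states that both output feedback ADP algorithms converge to the solution of the game algebraic Riccati equation (GARE). For background only, the sketch below is a minimal *model-based* Kleinman-style policy iteration that converges to the GARE solution; it is not the paper's model-free output feedback scheme, and the scalar system matrices are illustrative values chosen for the example.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def gare_policy_iteration(A, B1, B2, Q, R, gamma, iters=50):
    """Model-based policy iteration for the game algebraic Riccati equation
    A'P + PA + Q - P(B1 R^{-1} B1' - gamma^{-2} B2 B2')P = 0.
    Assumes a stable A (so the zero initial policy is admissible) and gamma
    large enough for a positive semidefinite stabilizing solution to exist."""
    n = A.shape[0]
    P = np.zeros((n, n))
    Rinv = np.linalg.inv(R)
    for _ in range(iters):
        K = Rinv @ B1.T @ P            # minimizing player's feedback gain
        L = (B2.T @ P) / gamma**2      # maximizing (disturbance) player's gain
        Ac = A - B1 @ K + B2 @ L       # closed loop under the current policies
        # Policy evaluation: Ac' P + P Ac + Q + K'RK - gamma^2 L'L = 0
        P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K - gamma**2 * (L.T @ L)))
    return P

# Illustrative scalar zero-sum game: u enters through B1, disturbance through B2.
A  = np.array([[-1.0]])
B1 = np.array([[1.0]])
B2 = np.array([[1.0]])
Q  = np.array([[1.0]])
R  = np.array([[1.0]])
gamma = 2.0
P = gare_policy_iteration(A, B1, B2, Q, R, gamma)
residual = A.T @ P + P @ A + Q - P @ (B1 @ np.linalg.inv(R) @ B1.T - B2 @ B2.T / gamma**2) @ P
print(P[0, 0], np.abs(residual).max())
```

For this scalar example the iteration reaches the positive stabilizing GARE root P = (sqrt(7) - 2)/1.5, approximately 0.4305. The paper's contribution is to arrive at this same solution online, without knowledge of the model matrices A, B1, B2 and without measuring the internal state.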