Output feedback adaptive dynamic programming for linear differential zero-sum games

Bibliographic Details
Published in: Automatica (Oxford), Vol. 122, Art. no. 109272
Main Authors: Rizvi, Syed Ali Asad; Lin, Zongli
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.12.2020
Subjects: Adaptive dynamic programming; Approximate dynamic programming; Output feedback; Zero-sum games
ISSN: 0005-1098, 1873-2836
Online Access: Full text
Abstract This paper addresses the problem of finding optimal output feedback strategies for linear differential zero-sum games using a model-free approach based on adaptive dynamic programming (ADP). In contrast to their discrete-time counterparts, differential games involve continuous-time dynamics, and existing ADP approaches to their solution require access to full measurement of the internal state. This difficulty arises because a direct translation of the discrete-time output feedback ADP results requires derivatives of the input and output measurements, which is generally prohibitive in practice. This work overcomes this difficulty by presenting a new embedded filtering-based observer approach to designing output feedback ADP algorithms for solving the differential zero-sum game problem. Two output feedback ADP algorithms, based respectively on policy iteration and value iteration, are developed. The proposed scheme is completely online and works without requiring knowledge of the system dynamics. In addition, this work addresses the excitation bias problem encountered in output feedback ADP methods, which typically requires a discounting factor for its mitigation. We show that the proposed scheme is bias-free and therefore does not require a discounting factor. The proposed algorithms are shown to converge to the solution obtained by solving the game algebraic Riccati equation. Two numerical examples are presented to validate the proposed scheme.
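The abstract states that the proposed model-free algorithms converge to the solution of the game algebraic Riccati equation (GARE). As a point of reference only, the sketch below illustrates that target object with a simple model-based fixed-point iteration: the indefinite disturbance term of the GARE is absorbed into the state weight and a standard ARE is solved at each step. This is not the paper's output feedback ADP algorithm (which uses no model knowledge); the system matrices A, B, D, weights Q, R, and attenuation level gamma are made-up example values.

```python
# Model-based illustration (NOT the paper's model-free ADP scheme):
# approximate the stabilizing solution of the game algebraic Riccati equation
#   A'P + PA + Q - P B R^{-1} B' P + gamma^{-2} P D D' P = 0
# by iterating standard AREs, absorbing the indefinite term into the weight.
import numpy as np
from scipy.linalg import solve_continuous_are

# Example data (assumed for illustration, not from the paper)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # plant dynamics
B = np.array([[0.0], [1.0]])               # control input matrix
D = np.array([[0.0], [0.5]])               # disturbance input matrix
Q = np.eye(2)                              # state weighting
R = np.array([[1.0]])                      # control weighting
gamma = 5.0                                # disturbance attenuation level

# Initialize with the disturbance-free LQR solution, then iterate.
P = solve_continuous_are(A, B, Q, R)
for _ in range(50):
    # Fold gamma^{-2} P D D' P into an effective state weight
    Q_eff = Q + (P @ D @ D.T @ P) / gamma**2
    P_next = solve_continuous_are(A, B, Q_eff, R)
    if np.linalg.norm(P_next - P) < 1e-10:
        P = P_next
        break
    P = P_next

# Residual of the full GARE at the computed P (near zero at convergence)
residual = np.linalg.norm(
    A.T @ P + P @ A + Q
    - P @ B @ np.linalg.inv(R) @ B.T @ P
    + (P @ D @ D.T @ P) / gamma**2
)
print(residual)
```

A fixed point of this iteration satisfies the GARE by construction; for sufficiently large gamma the iteration converges monotonically. The model-free algorithms in the paper reach the same solution from input and output data alone.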
ArticleNumber 109272
Author Rizvi, Syed Ali Asad (sr9gs@virginia.edu)
Lin, Zongli (zl5y@virginia.edu)
ContentType Journal Article
Copyright 2020 Elsevier Ltd
DOI 10.1016/j.automatica.2020.109272
Discipline Engineering
EISSN 1873-2836
ISSN 0005-1098
IsPeerReviewed true
IsScholarly true
Keywords Approximate dynamic programming
Zero-sum games
Adaptive dynamic programming
Output feedback
Language English
PublicationDate December 2020
PublicationTitle Automatica (Oxford)
PublicationYear 2020
Publisher Elsevier Ltd
StartPage 109272
SubjectTerms Adaptive dynamic programming
Approximate dynamic programming
Output feedback
Zero-sum games
Title Output feedback adaptive dynamic programming for linear differential zero-sum games
URI https://dx.doi.org/10.1016/j.automatica.2020.109272
Volume 122