Output feedback adaptive dynamic programming for linear differential zero-sum games

Bibliographic Details
Published in: Automatica (Oxford) Vol. 122; p. 109272
Main Authors: Rizvi, Syed Ali Asad, Lin, Zongli
Format: Journal Article
Language: English
Published: Elsevier Ltd 01.12.2020
ISSN: 0005-1098, 1873-2836
Description
Summary: This paper addresses the problem of finding optimal output feedback strategies for linear differential zero-sum games using a model-free approach based on adaptive dynamic programming (ADP). In contrast to their discrete-time counterparts, differential games involve continuous-time dynamics, and existing ADP approaches to their solution require full measurement of the internal state. This limitation arises because a direct translation of the discrete-time output feedback ADP results requires derivatives of the input and output measurements, which is generally prohibitive in practice. This work aims to overcome this difficulty and presents a new embedded-filtering-based observer approach to designing output feedback ADP algorithms for the differential zero-sum game problem. Two output feedback ADP algorithms are developed, based on policy iteration and value iteration, respectively. The proposed scheme is completely online and requires no knowledge of the system dynamics. This work also addresses the excitation bias problem encountered in output feedback ADP methods, whose mitigation typically requires a discounting factor. The proposed scheme is shown to be bias-free and therefore does not require a discounting factor. The proposed algorithms are shown to converge to the solution obtained by solving the game algebraic Riccati equation. Two numerical examples are presented to validate the proposed scheme.
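To make the game algebraic Riccati equation (GARE) named in the summary concrete, here is a minimal model-based sketch. It is not the paper's model-free output feedback algorithm: it assumes the system matrices are known and only shows the fixed point that such algorithms are said to converge to. The problem form (dynamics x' = A x + B u + D w, output y = C x, cost integrand y'Qy + u'Ru - gamma^2 w'w), the symbol names, and all numerical values are illustrative assumptions that do not appear in this record.

```python
import numpy as np


def solve_lyapunov(Ac, Qc):
    """Solve Ac.T @ P + P @ Ac + Qc = 0 for P by Kronecker vectorization."""
    n = Ac.shape[0]
    I = np.eye(n)
    M = np.kron(I, Ac.T) + np.kron(Ac.T, I)
    P = np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off


def gare_residual(P, A, B, D, C, Q, R, gamma):
    """Left-hand side of the assumed GARE:
    A'P + PA + C'QC - P B R^{-1} B' P + gamma^{-2} P D D' P = 0."""
    return (A.T @ P + P @ A + C.T @ Q @ C
            - P @ B @ np.linalg.solve(R, B.T) @ P
            + (P @ D @ D.T @ P) / gamma**2)


def policy_iteration(A, B, D, C, Q, R, gamma, iters=50, tol=1e-12):
    """Model-based policy iteration with both players updated together.

    Policies are u = -K x (minimizer) and w = L x (maximizer).
    Starting from K = 0, L = 0 assumes that A itself is stable.
    """
    n = A.shape[0]
    K = np.zeros((B.shape[1], n))
    L = np.zeros((D.shape[1], n))
    for _ in range(iters):
        # Policy evaluation: value matrix of the current policy pair.
        Ac = A - B @ K + D @ L
        Qc = C.T @ Q @ C + K.T @ R @ K - gamma**2 * (L.T @ L)
        P = solve_lyapunov(Ac, Qc)
        # Policy improvement for both players.
        K_new = np.linalg.solve(R, B.T @ P)
        L_new = (D.T @ P) / gamma**2
        change = max(np.abs(K_new - K).max(), np.abs(L_new - L).max())
        K, L = K_new, L_new
        if change < tol:
            break
    return P, K, L


# Hypothetical second-order example (values chosen for illustration only).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.0], [0.5]])
C = np.array([[1.0, 0.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
gamma = 5.0

P, K, L = policy_iteration(A, B, D, C, Q, R, gamma)
print("GARE residual norm:", np.linalg.norm(gare_residual(P, A, B, D, C, Q, R, gamma)))
```

The simultaneous update used here is one common model-based variant, chosen for brevity. The algorithms described in the summary instead work from measured input and output data without knowledge of A, B, or D.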
DOI: 10.1016/j.automatica.2020.109272