Regret Lower Bounds for Unbiased Adaptive Control of Linear Quadratic Regulators

Bibliographic details
Title: Regret Lower Bounds for Unbiased Adaptive Control of Linear Quadratic Regulators
Authors: Ziemann, Ingvar; Sandberg, Henrik
Source: IEEE Control Systems Letters. 4(3):785-790
Keywords: Adaptive control, Regulators, Convergence, Control theory, Reinforcement learning, Riccati equations, machine learning, estimation error, algorithm design and analysis
Description: We present lower bounds for the regret of adaptive control of the linear quadratic regulator. These are given in terms of problem-specific expected regret lower bounds valid for unbiased policies linear in the state. Our approach is based on the insight that the adaptive control problem can, given our assumptions, be reduced to a sequential estimation problem. This enables the use of the Cramér-Rao information inequality, which yields a scaling limit lower bound of logarithmic order. The bound features both information-theoretic and control-theoretic quantities. By leveraging existing results, we are able to show that the bound is tight in a special case.
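Note: the following is a heuristic sketch of why a Cramér-Rao argument can produce a logarithmic lower bound. It is not the bound derived in the paper. The scalar parameter \theta, the unbiased estimator \hat\theta_t, the Fisher information I_t(\theta) of the data up to time t, and the constants c and \kappa are all assumptions introduced for illustration only. The sketch further assumes that the information grows at most linearly in t and that the per-step excess cost is proportional to the estimation variance.

```latex
% Cramér-Rao inequality for an unbiased estimator (scalar case):
\[
  \operatorname{Var}\bigl(\hat\theta_t\bigr) \;\ge\; \frac{1}{I_t(\theta)} .
\]
% Assumption: information grows at most linearly, I_t(\theta) \le c\,t.
% Assumption: per-step excess cost is \kappa \operatorname{Var}(\hat\theta_t).
\[
  \text{Regret}_T
  \;\approx\; \kappa \sum_{t=1}^{T} \operatorname{Var}\bigl(\hat\theta_t\bigr)
  \;\ge\; \frac{\kappa}{c} \sum_{t=1}^{T} \frac{1}{t}
  \;\ge\; \frac{\kappa}{c} \,\ln(T+1) .
\]
```

Under these assumptions the accumulated cost of estimation error cannot grow more slowly than logarithmically in the horizon T.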
File description: electronic
Access URL: https://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-277665
https://kth.diva-portal.org/smash/get/diva2:1456341/FULLTEXT01.pdf
Database: SwePub
ISSN: 2475-1456
DOI: 10.1109/LCSYS.2020.2982455