Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems

Saved in:
Detailed bibliography
Title: Explainable data-driven Q-learning control for a class of discrete-time linear autonomous systems
Authors: Perrusquía, Adolfo, Zou, Mengbang, Guo, Weisi
Publisher information: Elsevier
https://www.sciencedirect.com/science/article/pii/S0020025524011976
Publication year: 2024
Collection: Cranfield University: Collection of E-Research - CERES
Subjects: Q-learning, State-transition function, Explainable Q-learning (XQL), Control policy
Description: Explaining what a reinforcement learning (RL) control agent learns plays a crucial role in the safety-critical control domain. Most state-of-the-art approaches focus on imitation learning methods that uncover the hidden reward function of a given control policy. However, these approaches do not uncover what the RL agent effectively learns from the agent-environment interaction. The policy learned by the RL agent depends on how well the state-transition mapping is inferred from the data. A wrongly inferred state-transition mapping implies that the RL agent is not learning properly, which can compromise the safety of both the surrounding environment and the agent itself. In this paper, we aim to uncover the elements learned by data-driven RL control agents in a special class of discrete-time linear autonomous systems. The approach adds a new explainability dimension to data-driven control, increasing trust in such methods and supporting their safe deployment. We focus on the classical data-driven Q-learning algorithm (an illustrative sketch of this baseline follows the record) and propose an explainable Q-learning (XQL) algorithm that can be further extended to other data-driven RL control agents. Simulation experiments under different scenarios, using several discrete-time models of autonomous platforms, demonstrate the effectiveness of the proposed approach. ; Information Sciences
Document type: article in journal/newspaper
File description: Article number 121283; application/pdf
Language: English
Relation: https://doi.org/10.1016/j.ins.2024.121283; https://dspace.lib.cranfield.ac.uk/handle/1826/22771
DOI: 10.1016/j.ins.2024.121283
Availability: https://doi.org/10.1016/j.ins.2024.121283
https://dspace.lib.cranfield.ac.uk/handle/1826/22771
Rights: Attribution 4.0 International ; http://creativecommons.org/licenses/by/4.0/
Accession number: edsbas.1EDF1C0E
Database: BASE
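Note: the "classical data-driven Q-learning algorithm" named in the description is, for discrete-time linear systems with quadratic cost, typically a least-squares policy-iteration scheme over a quadratic Q-function. The sketch below is a minimal illustration of that baseline, assuming a hypothetical 2-state system x[k+1] = A x[k] + B u[k]; it is not the paper's XQL method, and the matrices A, B, the cost weights, and all tuning constants are illustrative only.

    import numpy as np

    # Hypothetical discrete-time linear system x[k+1] = A x[k] + B u[k].
    # A and B are used only to simulate transitions; the learner never reads them.
    A = np.array([[0.9, 0.1],
                  [0.0, 0.8]])
    B = np.array([[0.0],
                  [1.0]])
    Qc = np.eye(2)            # state cost weight (illustrative)
    Rc = np.eye(1)            # input cost weight (illustrative)
    n, m = 2, 1

    def phi(z):
        # Quadratic features: theta @ phi(z) == z @ H @ z for symmetric H,
        # stored as its upper triangle (off-diagonal entries doubled).
        i, j = np.triu_indices(len(z))
        return np.outer(z, z)[i, j] * np.where(i == j, 1.0, 2.0)

    rng = np.random.default_rng(0)
    K = np.zeros((m, n))      # initial policy gain, u = -K x

    for _ in range(10):       # policy iteration
        Phi, y = [], []
        for _ in range(300):  # transitions sampled with exploration noise
            x = rng.standard_normal(n)
            u = -K @ x + 0.5 * rng.standard_normal(m)
            x_next = A @ x + B @ u
            u_next = -K @ x_next                  # on-policy successor action
            cost = x @ Qc @ x + u @ Rc @ u
            # Undiscounted Bellman residual: Q(x,u) - Q(x',u') = cost,
            # valid while the closed loop under K is stable.
            Phi.append(phi(np.concatenate([x, u]))
                       - phi(np.concatenate([x_next, u_next])))
            y.append(cost)
        theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
        H = np.zeros((n + m, n + m))              # rebuild symmetric H
        H[np.triu_indices(n + m)] = theta
        H = H + np.triu(H, 1).T
        # Greedy improvement: argmin_u [x;u]' H [x;u] gives u = -Huu^{-1} Hux x.
        K = np.linalg.solve(H[n:, n:], H[n:, :n])

    print("learned state-feedback gain K =", K)

The learner estimates the quadratic Q-function purely from sampled transitions and costs, never reading A or B directly; the paper's XQL contribution, per the description, goes further by exposing what state-transition information such an agent has implicitly inferred.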