Multi-agent reinforcement learning via distributed MPC as a function approximator

Bibliographic details
Published in:	Automatica (Oxford), Vol. 167, p. 111803
Main authors:	Mallick, Samuel; Airaldi, Filippo; Dabiri, Azita; De Schutter, Bart
Format:	Journal Article
Language:	English
Published:	Elsevier Ltd, 1 September 2024
Keywords:	ADMM; Distributed model predictive control; Multi-agent reinforcement learning; Networked systems
ISSN:	0005-1098

Description
Abstract:	This paper presents a novel approach to multi-agent reinforcement learning (RL) for linear systems with convex polytopic constraints. Existing work on RL has demonstrated the use of model predictive control (MPC) as a function approximator for the policy and value functions. The current paper is the first work to extend this idea to the multi-agent setting. We propose the use of a distributed MPC scheme as a function approximator, with a structure allowing for distributed learning and deployment. We then show that Q-learning updates can be performed distributively without introducing nonstationarity, by reconstructing a centralized learning update. The effectiveness of the approach is demonstrated on a numerical example.
DOI:	10.1016/j.automatica.2024.111803