Effective Communications: A Joint Learning and Communication Framework for Multi-Agent Reinforcement Learning Over Noisy Channels

Bibliographic Details
Published in: IEEE Journal on Selected Areas in Communications Vol. 39; no. 8; pp. 2590-2603
Main Authors: Tung, Tze-Yang; Kobus, Szymon; Roig, Joan Pujol; Gunduz, Deniz
Format: Journal Article
Language: English
Published: New York, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.08.2021
ISSN: 0733-8716, 1558-0008
Description
Summary: We propose a novel formulation of the "effectiveness problem" in communications, put forth by Shannon and Weaver in their seminal work "The Mathematical Theory of Communication", by considering multiple agents communicating over a noisy channel in order to achieve better coordination and cooperation in a multi-agent reinforcement learning (MARL) framework. Specifically, we consider a multi-agent partially observable Markov decision process (MA-POMDP), in which the agents, in addition to interacting with the environment, can also communicate with each other over a noisy communication channel. The noisy communication channel is considered explicitly as part of the dynamics of the environment, and the message each agent sends is part of the action that the agent can take. As a result, the agents learn not only to collaborate with each other but also to communicate "effectively" over a noisy channel. This framework generalizes both the traditional communication problem, where the main goal is to convey a message reliably over a noisy channel, and the "learning to communicate" framework that has received recent attention in the MARL literature, where the underlying communication channels are assumed to be error-free. We show via examples that the joint policy learned using the proposed framework is superior to one in which the communication is considered separately from the underlying MA-POMDP. This framework has many real-world applications, from autonomous vehicle planning to drone swarm control, and opens up the rich toolbox of deep reinforcement learning for the design of multi-user communication systems.
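The central modelling idea in the summary, that the noisy channel is part of the environment dynamics and each agent's message is part of its action, can be illustrated with a toy sketch. Everything below is a hypothetical illustration and not the authors' environments or code: the two-agent guessing task, the names `NoisyGuessingEnv`, `binary_symmetric_channel`, and `run_handcrafted_episode`, and the choice of a binary symmetric channel are all assumptions made for this example. The hand-written repetition-code policy stands in for the kind of behaviour a learned joint policy would have to discover.

```python
import random


def binary_symmetric_channel(bits, flip_prob, rng):
    """Flip each bit independently with probability flip_prob (assumed channel model)."""
    return [bit ^ 1 if rng.random() < flip_prob else bit for bit in bits]


class NoisyGuessingEnv:
    """Hypothetical two-agent task: agent 0 sees a target bit, agent 1 must guess it.

    Each agent's action is a pair (env_action, message_bits). The messages are
    corrupted inside step(), so the channel is part of the environment dynamics.
    """

    def __init__(self, flip_prob=0.1, message_len=3, seed=0):
        self.flip_prob = flip_prob
        self.message_len = message_len
        self.rng = random.Random(seed)
        self.target = 0
        self.received = [[0] * message_len for _ in range(2)]

    def _observations(self):
        # Partial observability: only agent 0 sees the target bit.
        return [(self.target, list(self.received[0])),
                (None, list(self.received[1]))]

    def reset(self):
        self.target = self.rng.randint(0, 1)
        self.received = [[0] * self.message_len for _ in range(2)]
        return self._observations()

    def step(self, actions):
        (_, msg0), (guess, msg1) = actions
        # Shared reward: 1 if agent 1's guess matches the hidden target.
        reward = 1.0 if guess == self.target else 0.0
        # Messages cross the noisy channel as part of the state transition.
        self.received[1] = binary_symmetric_channel(msg0, self.flip_prob, self.rng)
        self.received[0] = binary_symmetric_channel(msg1, self.flip_prob, self.rng)
        return self._observations(), reward


def run_handcrafted_episode(env):
    """Repetition code plus majority vote, a stand-in for a learned joint policy."""
    obs = env.reset()
    silent = [0] * env.message_len
    target, _ = obs[0]
    # Step 1: agent 0 repeats the target bit; agent 1 has heard nothing yet.
    obs, _ = env.step([(None, [target] * env.message_len), (0, silent)])
    _, heard = obs[1]
    guess = 1 if sum(heard) * 2 > len(heard) else 0
    # Step 2: agent 1 acts on the majority vote of what it received.
    _, reward = env.step([(None, silent), (guess, silent)])
    return reward
```

In this sketch a reinforcement learning algorithm would replace `run_handcrafted_episode`, choosing both `env_action` and `message_bits` to maximize the shared reward, so the coding scheme is learned jointly with the task behaviour instead of being designed separately.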
DOI: 10.1109/JSAC.2021.3087248