Learning Constrained Resource Allocation Policies in Wireless Control Systems

Bibliographic Details
Published in: Proceedings of the IEEE Conference on Decision & Control, pp. 2615-2621
Main Authors: Lima, Vinicius; Eisen, Mark; Ribeiro, Alejandro
Format: Conference Proceeding
Language: English
Published: IEEE 14.12.2020
ISSN: 2576-2370
Description
Summary: Emerging IoT applications use wireless communication networks to exchange data between spatially distributed components of a control system. Because wireless networks are noisy and subject to packet losses, which can degrade the operation of the control system, communication resources must be distributed properly among the components sharing the network to keep the system operating reliably. In particular, we study settings in which the decision maker must meet additional constraints on the control system while distributing communication resources. The constraints and the infinite dimensionality of the problem make resource allocation challenging. To reduce the dimensionality, we parameterize the resource allocation policy with neural networks, which are high-capability approximators, and use reinforcement learning techniques to design policies that do not require knowledge of plant dynamics or communication models. We further reformulate the resource allocation problem in the dual domain to handle the constraints, which leads naturally to a primal-dual policy gradient algorithm that alternates between updating the policy parameters via reinforcement learning iterations and updating a dual variable that enforces constraint satisfaction. We conclude the paper with numerical simulations that show the strong performance of the learned allocation policies against baseline resource allocation solutions.
DOI: 10.1109/CDC42340.2020.9303805
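
The primal-dual alternation described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation or setup: it is a toy one-step problem chosen for illustration, in which a scalar Gaussian policy mean `theta` stands in for the neural network parameters, the cost `(a - 2)^2` stands in for control performance, and the constraint `E[a] <= 1` stands in for the additional control-system constraint. The function name, step sizes, and batch size are all assumptions. The policy update uses a score-function (REINFORCE-style) gradient estimate of the Lagrangian, so it only needs sampled costs and no model of them.

```python
import numpy as np

def primal_dual_policy_gradient(steps=1500, batch=256, sigma=0.5,
                                lr_theta=0.05, lr_lam=0.05, seed=0):
    """Toy primal-dual policy gradient (hypothetical example, not the paper's code).

    Minimizes E[(a - 2)^2] subject to E[a] <= 1, with a ~ N(theta, sigma^2).
    """
    rng = np.random.default_rng(seed)
    theta, lam = 0.0, 0.0  # policy parameter and dual variable
    for _ in range(steps):
        # Sample a batch of actions from the current Gaussian policy.
        a = theta + sigma * rng.standard_normal(batch)
        cost = (a - 2.0) ** 2      # sampled objective
        slack = a - 1.0            # sampled constraint value, want E[slack] <= 0
        lagr = cost + lam * slack  # sampled Lagrangian

        # Primal step: score-function gradient estimate with a mean baseline.
        score = (a - theta) / sigma ** 2  # d/dtheta of log N(a; theta, sigma^2)
        grad_theta = np.mean((lagr - lagr.mean()) * score)
        theta -= lr_theta * grad_theta

        # Dual step: projected gradient ascent on the constraint violation.
        lam = max(0.0, lam + lr_lam * slack.mean())
    return theta, lam

theta, lam = primal_dual_policy_gradient()
print(f"theta={theta:.2f}, lambda={lam:.2f}")
```

For this toy problem the constrained optimum is `theta = 1` with multiplier `lambda = 2`, so the printed values should settle near those numbers, up to sampling noise. The dual variable stays at zero while the constraint is satisfied and grows only once the policy mean crosses the constraint boundary, which is the mechanism the summary refers to as enforcing constraint satisfaction.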