Learning Constrained Resource Allocation Policies in Wireless Control Systems

Bibliographic details
Published in: Proceedings of the IEEE Conference on Decision & Control, pp. 2615-2621
Main authors: Lima, Vinicius; Eisen, Mark; Ribeiro, Alejandro
Format: Conference paper
Language: English
Published: IEEE, 14 December 2020
ISSN:2576-2370
Description
Summary: Emerging applications in IoT systems employ wireless communication networks to exchange data between spatially distributed components of a control system. As wireless networks are noisy and subject to packet losses - which might impact the operation of the control system - proper distribution of communication resources among components of the wireless control system sharing the communication network is essential to maintain the system operating reliably. Here, in particular, we study settings in which the decision maker must meet additional constraints on the control system while distributing communication resources. The existence of constraints and the infinite dimensionality of the problem make the resource allocation problem challenging. To reduce the dimensionality of the problem, we parameterize the resource allocation policy in terms of neural networks - high capability approximators - and leverage reinforcement learning techniques to design policies that do not require knowledge of plant dynamics or communication models. We further reformulate the resource allocation problem in the dual domain to handle the constraints, which leads naturally to a primal-dual policy gradient algorithm that alternates between updating the policy parameters via reinforcement learning iterations and updating a dual variable that enforces constraint satisfaction. We conclude the paper with numerical simulations that show the strong performance of the learned allocation policies against baseline resource allocation solutions.
DOI:10.1109/CDC42340.2020.9303805
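
The alternating scheme described in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the toy one-step problem, the quadratic cost, the resource budget, the step sizes, and the function name `primal_dual_policy_gradient` are all assumptions made for illustration, and a single Gaussian mean stands in for the neural network policy. The primal step uses a score-function (REINFORCE-style) gradient estimate of the Lagrangian computed from sampled costs only, and the dual step is a projected gradient ascent on the constraint violation.

```python
import numpy as np


def primal_dual_policy_gradient(num_iters=2000, batch_size=256,
                                step_primal=0.05, step_dual=0.05,
                                sigma=0.5, budget=1.0, target=3.0, seed=0):
    """Toy primal-dual policy gradient (illustrative assumptions throughout).

    Policy: allocation a ~ N(mu, sigma^2), with mu the only learned parameter.
    Objective: minimize E[(a - target)^2] subject to E[a] <= budget.
    """
    rng = np.random.default_rng(seed)
    mu, lam = 0.0, 0.0  # policy parameter and dual variable

    for _ in range(num_iters):
        # Sample a batch of allocations from the current policy.
        actions = mu + sigma * rng.standard_normal(batch_size)

        # Observed per-sample cost and constraint slack (hypothetical forms).
        cost = (actions - target) ** 2
        slack = actions - budget

        # Per-sample Lagrangian; subtract the batch mean as a baseline.
        lagrangian = cost + lam * slack
        advantage = lagrangian - lagrangian.mean()

        # Score function of the Gaussian policy with respect to mu.
        score = (actions - mu) / sigma ** 2

        # Primal step: stochastic gradient descent on the Lagrangian.
        mu -= step_primal * np.mean(advantage * score)

        # Dual step: ascent on the average violation, projected onto lam >= 0.
        lam = max(0.0, lam + step_dual * slack.mean())

    return mu, lam


if __name__ == "__main__":
    mu, lam = primal_dual_policy_gradient()
    print(f"mean allocation: {mu:.3f}, dual variable: {lam:.3f}")
```

In this toy setting the unconstrained optimum (a mean allocation of 3) violates the budget of 1, so the dual variable grows until the policy mean settles near the budget; the analytic saddle point for these assumed values is a mean of 1 with a dual variable of 4.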