Learning Constrained Resource Allocation Policies in Wireless Control Systems
| Published in: | Proceedings of the IEEE Conference on Decision & Control, pp. 2615-2621 |
|---|---|
| Main authors: | , , |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 14 December 2020 |
| Subjects: | |
| ISSN: | 2576-2370 |
| Summary: | Emerging applications in IoT systems employ wireless communication networks to exchange data between spatially distributed components of a control system. Because wireless networks are noisy and subject to packet losses, which can degrade the operation of the control system, communication resources must be distributed properly among the components sharing the network to keep the system operating reliably. Here, in particular, we study settings in which the decision maker must meet additional constraints on the control system while distributing communication resources. The constraints and the infinite dimensionality of the problem make resource allocation challenging. To reduce the dimensionality, we parameterize the resource allocation policy with neural networks, which are high-capacity function approximators, and leverage reinforcement learning techniques to design policies that do not require knowledge of plant dynamics or communication models. We further reformulate the resource allocation problem in the dual domain to handle the constraints, which leads naturally to a primal-dual policy gradient algorithm that alternates between updating the policy parameters via reinforcement learning iterations and updating a dual variable that enforces constraint satisfaction. We conclude the paper with numerical simulations that show the strong performance of the learned allocation policies against baseline resource allocation solutions. |
| DOI: | 10.1109/CDC42340.2020.9303805 |
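
The alternating update described in the summary (a policy gradient step on the policy parameters, then a step on a dual variable that penalizes constraint violation) can be illustrated on a toy problem. The sketch below is not the authors' algorithm or experimental setup: it assumes a one-step scalar allocation problem, replaces the neural network policy with a Gaussian policy that has a single mean parameter `theta`, and uses a plain REINFORCE estimator with a batch-mean baseline. The cost `(a - 2)^2`, the budget constraint `E[a] <= 1`, the step sizes, and every name in the code are hypothetical choices made for illustration only.

```python
import numpy as np


def primal_dual_policy_gradient(num_iters=2000, batch_size=256, sigma=0.5,
                                step_theta=0.05, step_lam=0.05, seed=0):
    """Toy primal-dual policy gradient on a one-step scalar allocation problem.

    Hypothetical setup (an assumption, not taken from the paper):
      policy      a ~ Normal(theta, sigma^2)   (a = resource allocated)
      cost        (a - 2)^2                    (smallest when a = 2)
      constraint  E[a] - 1 <= 0                (average resource budget of 1)
    """
    rng = np.random.default_rng(seed)
    theta, lam = 0.0, 0.0  # policy parameter and dual variable

    for _ in range(num_iters):
        # Sample a batch of allocations from the current policy.
        a = theta + sigma * rng.standard_normal(batch_size)

        cost = (a - 2.0) ** 2        # per-sample control cost
        slack = a - 1.0              # per-sample constraint value
        lagrangian = cost + lam * slack

        # Score function of the Gaussian policy with respect to theta.
        score = (a - theta) / sigma ** 2

        # REINFORCE estimate of d/dtheta E[lagrangian], with the batch
        # mean as a baseline to reduce variance.
        grad_theta = np.mean((lagrangian - lagrangian.mean()) * score)

        # Primal step: descend on the Lagrangian in the policy parameter.
        theta -= step_theta * grad_theta

        # Dual step: ascend on the estimated constraint violation and
        # project onto lam >= 0.
        lam = max(0.0, lam + step_lam * slack.mean())

    return float(theta), float(lam)


if __name__ == "__main__":
    theta, lam = primal_dual_policy_gradient()
    print(f"theta = {theta:.3f}, lambda = {lam:.3f}")
```

For this toy problem the saddle point can be computed by hand: the expected cost is `(theta - 2)^2 + sigma^2`, so the constrained minimizer is `theta = 1` with multiplier `lambda = 2`, and the iterates should settle near those values up to sampling noise. Note that the updates use only sampled costs and constraint values, never the closed-form expressions, which mirrors the model-free character described in the summary.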