Scheduling Real-time Wireless Traffic: A Network-aided Offline Reinforcement Learning Approach
| Published in: | IEEE Internet of Things Journal, Vol. 10, No. 24, p. 1 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 15 December 2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2327-4662 |
| Summary: | Real-time traffic has stringent latency requirements, and deadline guarantees on packet delivery play a vital role in real-time IoT applications. Deadline-aware wireless scheduling of real-time traffic has been a long-standing open problem, despite significant efforts using analytical methods. Departing from the conventional approaches, this work studies deadline-aware traffic scheduling by taking an offline reinforcement learning (RL) approach to train scheduling algorithms that are then ready to be used for online scheduling. To address the challenges therein, we propose a Network-Aided Offline RL (NA-ORL) framework for deadline-aware scheduling, which exploits the fact that the network dynamics follow a well-defined physics model. Specifically, in NA-ORL the scheduling policy is initialized through behavior cloning of a good model-based scheduling algorithm, and a network-aided actor-critic (A-C) method, chosen for its policy-improvement property, is then used to train a better scheduling policy with carefully designed states and reward function. Building on NA-ORL, we further devise a Network-Aided Offline Meta-RL (NA-MRL) algorithm to deal with non-stationary network dynamics. Extensive experimental results demonstrate that the proposed NA-ORL and NA-MRL algorithms achieve better performance than Adaptive Mixing over Non-Dominated links (AMIX-ND) and Largest-Deficit-First (LDF) in various deadline-aware wireless scheduling scenarios. |
|---|---|
| DOI: | 10.1109/JIOT.2023.3304969 |