Joint Codebook Selection and MCS Adaptation for MmWave eMBB Services Based on Deep Reinforcement Learning
| Published in: | IEEE Internet of Things Journal, Vol. 11, No. 19, pp. 31545-31560 |
|---|---|
| Main authors: | , , |
| Medium: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 01.10.2024 (The Institute of Electrical and Electronics Engineers, Inc.) |
| ISSN: | 2327-4662 |
| Summary: | This article investigates the joint codebook selection and modulation-coding-scheme (MCS) adaptation problem for the enhanced mobile broadband (eMBB) service in millimeter-wave (mmWave) cellular systems. The proposed scheme guarantees efficient mmWave eMBB service through an intelligent joint codebook selection and MCS adaptation scheme based on deep reinforcement learning (DRL), referred to as DeepCM. DeepCM's objective is to maximize the transmission data rate while satisfying a target block error rate (BLER) constraint. The joint problem is first formulated as a two-time-scale system that performs MCS adaptation on a small time scale and codebook selection on a large time scale. DeepCM then introduces a new DRL algorithm, termed dual deep Q-network (dual-DQN), by incorporating the operations on the two time scales into the original DQN. The dual-DQN enables the operations on different time scales to benefit from each other through closed-loop decision guidance and reward evaluation. Thereafter, to fulfill the preset BLER constraint, DeepCM uses a constrained ε-greedy strategy for decision-making and further modifies the conventional DRL training mechanism, continuously adjusting the agent's feasible-action space toward the system objective. With the constrained dual-DQN, DeepCM attains its goal even without any prior network information. Simulation results show that DeepCM, compared with the Thompson Sampling-DRL, DRL-OLLA, and TS2 schemes, guarantees the target BLER requirement while yielding a much higher data rate. Extensive simulations demonstrate the robustness of DeepCM across diverse scenarios, and DeepCM also handles dynamic changes in the target BLER well. |
|---|---|
| DOI: | 10.1109/JIOT.2024.3419900 |
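
The summary's core algorithmic idea, a constrained ε-greedy policy running inside a two-time-scale loop, can be illustrated with a short sketch. The paper's code is not reproduced here, so everything below is a minimal, hypothetical rendering: the names (`constrained_epsilon_greedy`, `feasible_mask`), the toy BLER feasibility rule, and the NumPy-based structure are assumptions, not DeepCM's actual implementation.

```python
import numpy as np

def constrained_epsilon_greedy(q_values, feasible_mask, epsilon, rng):
    """Epsilon-greedy selection restricted to a feasible-action set.

    Actions whose estimated BLER would violate the target are masked out,
    so both exploration and exploitation stay inside the feasible set.
    """
    feasible = np.flatnonzero(feasible_mask)
    if feasible.size == 0:
        return 0  # fall back to the most conservative action (lowest MCS)
    if rng.random() < epsilon:
        return int(rng.choice(feasible))  # explore within the feasible set
    return int(feasible[np.argmax(q_values[feasible])])  # exploit within it

rng = np.random.default_rng(0)
N_MCS, N_CODEBOOKS = 8, 4  # hypothetical action-space sizes

# Two-time-scale loop: codebook selection on the large scale,
# MCS adaptation on the small scale (many MCS decisions per codebook epoch).
for epoch in range(3):  # large time scale
    q_codebook = rng.normal(size=N_CODEBOOKS)  # stand-in for the codebook DQN output
    codebook = int(np.argmax(q_codebook))
    for slot in range(5):  # small time scale
        q_mcs = rng.normal(size=N_MCS)  # stand-in for the MCS DQN output
        # Toy feasibility rule: pretend only the lower MCS levels meet the BLER target.
        feasible_mask = np.arange(N_MCS) < 5
        mcs = constrained_epsilon_greedy(q_mcs, feasible_mask, epsilon=0.1, rng=rng)
        # ...transmit with (codebook, mcs), observe ACK/NACK, update the Q-networks...
```

In the scheme the summary describes, the two stand-in Q-vectors would come from the dual-DQN's two networks, and the feasibility mask would be adjusted from observed BLER statistics rather than a fixed rule.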