Joint Codebook Selection and MCS Adaptation for MmWave eMBB Services Based on Deep Reinforcement Learning


Detailed bibliography
Published in: IEEE Internet of Things Journal, Vol. 11, No. 19, pp. 31545-31560
Main authors: Ye, Xiaowen; Fu, Liqun; Cioffi, John M.
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.10.2024
ISSN: 2327-4662
Description
Summary: This article investigates the joint codebook selection and modulation-coding-scheme (MCS) adaptation issue for the enhanced mobile broadband (eMBB) service in millimeter-wave (mmWave) cellular systems. The proposed scheme guarantees efficient mmWave eMBB service through an intelligent joint codebook selection and MCS adaptation scheme that exploits deep reinforcement learning (DRL), referred to as DeepCM. DeepCM's objective maximizes the transmission data rate while satisfying a target block error rate (BLER) constraint. A first step formulates this joint problem as a two-time-scale system that performs MCS adaptation on a small time scale, whereas a second step optimizes the codebook on a large time scale. DeepCM introduces a new DRL algorithm, termed dual-deep Q-network (dual-DQN), by incorporating the operations on the two time scales into the original DQN. Dual-DQN essentially enables the operations on different time scales to benefit from each other through closed-loop decision guidance and reward evaluation. Thereafter, to fulfill the preset BLER constraint, DeepCM uses a constrained ε-greedy strategy for decision-making and further modifies the conventional DRL training mechanism. In essence, DeepCM continuously adjusts the agent's feasible-action space toward the system objective. With the constrained dual-DQN, DeepCM can attain its goal even without any prior network information. Simulation results show that DeepCM, compared with the Thompson Sampling-DRL, DRL-OLLA, and TS2 schemes, guarantees the target BLER requirement while yielding a much higher data rate. Various simulations demonstrate the strong robustness of DeepCM under miscellaneous scenarios. Furthermore, DeepCM handles dynamic target-BLER changes well.
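The two ideas the abstract names, a constrained ε-greedy selection restricted to a feasible-action set and a two-time-scale loop (codebook re-selected on the large scale, MCS adapted every slot), can be illustrated with a minimal sketch. This is not the paper's implementation: all function names, the `large_period` parameter, and the use of plain Q-value dictionaries in place of trained deep Q-networks are illustrative assumptions.

```python
import random

def constrained_epsilon_greedy(q_values, feasible_actions, epsilon):
    """Epsilon-greedy selection restricted to a feasible-action set."""
    if random.random() < epsilon:
        # Explore: sample uniformly, but only among actions currently
        # believed to satisfy the BLER constraint.
        return random.choice(list(feasible_actions))
    # Exploit: highest-Q action within the feasible set.
    return max(feasible_actions, key=lambda a: q_values[a])

def two_time_scale_rollout(codebooks, mcs_levels, q_codebook, q_mcs,
                           epsilon=0.1, large_period=10, steps=30):
    """Sketch of a two-time-scale control loop: the codebook is re-selected
    every `large_period` slots (large time scale), while the MCS is adapted
    in every slot (small time scale) under the current codebook."""
    trace = []
    codebook = None
    for t in range(steps):
        if t % large_period == 0:
            # Large time scale: pick a codebook from the feasible set.
            codebook = constrained_epsilon_greedy(q_codebook, codebooks, epsilon)
        # Small time scale: adapt the MCS within the current codebook.
        mcs = constrained_epsilon_greedy(q_mcs, mcs_levels, epsilon)
        trace.append((codebook, mcs))
    return trace
```

In the actual scheme described by the abstract, the feasible-action set would itself be adjusted online toward the target BLER, and the Q-values would come from the two coupled deep Q-networks of dual-DQN rather than static tables.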
DOI:10.1109/JIOT.2024.3419900