Neural Episodic Control-Based Adaptive Modulation and Coding Scheme for Inter-Satellite Communication Link


Detailed bibliography
Published in: IEEE Access, Volume 9, pp. 159175-159186
Main authors: Lee, Donggu; Sun, Young Ghyu; Sim, Isaac; Kim, Jae-Hyun; Shin, Yoan; Kim, Dong In; Kim, Jin Young
Format: Journal Article
Language: English
Published: Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2021
ISSN: 2169-3536
Description
Summary: Inter-satellite links (ISLs) play an important role in the global navigation satellite system (GNSS) and are known as one of the key technologies for the next generation of navigation satellite systems. Deep reinforcement learning algorithms have achieved significant improvements over various wireless communication systems. However, it has been reported that the deep Q-network (DQN) algorithm requires an enormous number of trials. To resolve this problem, in this paper we propose an adaptive modulation and coding scheme based on the neural episodic control (NEC) algorithm, a deep reinforcement learning algorithm. The proposed scheme adjusts the modulation and coding scheme region boundaries with the differentiable neural dictionary of the NEC agent, which enables effective integration of previous experience. In addition, we propose a step-size varying algorithm to encourage the NEC agent to approach the suboptimal state efficiently. We confirm that the proposed scheme reduces the number of trials to 1/8 of that required by the previous DQN-based adaptive modulation scheme. It is also confirmed that the proposed scheme reaches the suboptimal state in 1/5 of the trials required by the fixed step-size dueling double DQN scheme and in 1/7 of those required by the fixed step-size double DQN-based scheme. To further evaluate the proposed scheme, we employ an online learning loss evaluation algorithm that calculates the loss at each time step from the interaction records of the reinforcement learning agent and the derived modulation and coding scheme region boundaries.
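As a rough illustration of the differentiable neural dictionary (DND) through which the NEC agent integrates previous experience, the sketch below implements the generic k-nearest-neighbour lookup with an inverse-distance kernel described in the NEC literature. It is not the authors' implementation; the parameter values, the embedding of channel observations into keys, and the mapping from Q-values to MCS region boundaries are hypothetical placeholders.

```python
# Illustrative sketch only (not the paper's code): a minimal differentiable
# neural dictionary (DND), the episodic memory NEC uses to estimate Q-values
# from stored experience. capacity, k and delta are arbitrary placeholders.
import numpy as np

class DND:
    def __init__(self, key_dim, capacity=10000, k=5, delta=1e-3):
        self.keys = np.empty((0, key_dim))   # embedded observations
        self.values = np.empty(0)            # stored Q-value estimates
        self.capacity, self.k, self.delta = capacity, k, delta

    def lookup(self, query):
        """Kernel-weighted average of the k nearest stored Q-values."""
        if self.values.size == 0:
            return 0.0
        d = np.linalg.norm(self.keys - query, axis=1)
        idx = np.argsort(d)[: self.k]
        w = 1.0 / (d[idx] ** 2 + self.delta)   # inverse-distance kernel
        return float(np.dot(w, self.values[idx]) / w.sum())

    def write(self, key, value):
        """Append a new (key, Q-value) pair, evicting the oldest when full."""
        self.keys = np.vstack([self.keys, key])
        self.values = np.append(self.values, value)
        if self.values.size > self.capacity:
            self.keys, self.values = self.keys[1:], self.values[1:]

# Hypothetical usage: one DND per candidate MCS action; the action whose
# lookup returns the highest Q-estimate would be selected.
# dnd = DND(key_dim=4)
# dnd.write(np.array([0.3, 0.1, 0.9, 0.2]), value=0.7)
# q = dnd.lookup(np.array([0.28, 0.12, 0.88, 0.19]))
```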
DOI: 10.1109/ACCESS.2021.3131714