Neural Episodic Control-Based Adaptive Modulation and Coding Scheme for Inter-Satellite Communication Link
| Published in: | IEEE Access, vol. 9, pp. 159175-159186 |
|---|---|
| Main authors: | , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway; IEEE; 2021; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2169-3536 |
| Summary: | Inter-satellite links (ISLs) play an important role in the global navigation satellite system (GNSS) and are regarded as one of the key technologies for the next generation of navigation satellite systems. Deep reinforcement learning algorithms have achieved significant improvements in various wireless communication systems. However, the deep Q-network (DQN) algorithm has been reported to require an enormous number of trials. To resolve this problem, we propose an adaptive modulation and coding scheme based on the neural episodic control (NEC) algorithm, a deep reinforcement learning algorithm. The proposed scheme adjusts the modulation and coding scheme region boundaries using the differentiable neural dictionary of the NEC agent, which enables effective integration of previous experience. In addition, we propose a step-size varying algorithm that encourages the NEC agent to approach the suboptimal state efficiently. We confirm that the proposed scheme reduces the number of trials to 1/8 of that required by the previous DQN-based adaptive modulation scheme. We also confirm that the number of trials the proposed scheme needs to reach the suboptimal state is 1/5 of that of the fixed step-size dueling double DQN-based scheme and 1/7 of that of the fixed step-size double DQN-based scheme. To further evaluate the proposed scheme, we employ an online learning loss evaluation algorithm that calculates the loss per time step based on the interaction records of the reinforcement learning agent and the derived modulation and coding scheme region boundaries. |
|---|---|
| DOI: | 10.1109/ACCESS.2021.3131714 |
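
The summary mentions the differentiable neural dictionary of the NEC agent but gives no details of it. As a rough illustration only, the sketch below assumes the usual NEC-style lookup: a value is estimated as an inverse-distance-weighted average over the k stored keys nearest to a query. The function name `dnd_lookup`, its parameters, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def dnd_lookup(query, keys, values, k=3, delta=1e-3):
    """Hypothetical sketch: estimate a value from the k nearest stored entries."""
    query = np.asarray(query, dtype=float)
    keys = np.asarray(keys, dtype=float)
    values = np.asarray(values, dtype=float)

    # Squared Euclidean distance from the query to every stored key.
    sq_dist = np.sum((keys - query) ** 2, axis=1)

    # Indices of the k closest keys (fewer if the memory holds fewer than k).
    nearest = np.argsort(sq_dist)[:k]

    # Inverse-distance kernel; delta keeps the weight finite for exact matches.
    kernel = 1.0 / (sq_dist[nearest] + delta)
    weights = kernel / kernel.sum()

    # Weighted average of the stored values.
    return float(weights @ values[nearest])

# Toy memory (assumed data): two keys equidistant from the query.
estimate = dnd_lookup([0.0, 0.0], [[1.0, 0.0], [-1.0, 0.0]], [2.0, 4.0], k=2)
```

Because both stored keys are equally far from the query in this toy case, they receive equal weight and the estimate is the plain mean of the two stored values.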