Orthogonal Gradient Multi-Agent Deep Reinforcement Learning-based Algorithm for Spectrum Resource Optimization in UAV Swarm

Bibliographic details
Published in:	Physical Communication, Volume 72; p. 102770
Main authors:	Lou, Dengke, Wan, Boyu, Zhang, Yu, Du, Yihang, Wan, Fayu, Chen, Yong
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 1 October 2025
Subjects:	Multi-agent Deep Reinforcement Learning (MADRL); Orthogonal Gradient (OG); Spectrum resource allocation; Unmanned aerial vehicle (UAV)
ISSN:	1874-4907

Description
Summary:	In recent years, Unmanned Aerial Vehicle (UAV) swarm technology has made rapid progress, but communication problems remain a key factor limiting its widespread application. This study proposes an Orthogonal Gradient Multi-Agent Deep Reinforcement Learning-based Algorithm for Spectrum Resource Optimization in UAV Swarm (OG-MADRL) to address this problem. Specifically, to counter the gradient instability that may arise during spectrum resource optimization for a UAV swarm, this study introduces an Orthogonal Gradient (OG) system, which combines orthogonal initialization with global gradient clipping. This approach alleviates both exploding and vanishing gradients, thereby enhancing training stability and accelerating convergence. To satisfy dynamic communication demands, a reward mechanism is designed to enable flexible resource allocation to each UAV. To overcome the poor coordination in distributed training frameworks, the proposed algorithm is implemented within a Centralized Training and Distributed Execution (CTDE) framework, which integrates global information to enable rapid responses to environmental changes. Simulation results show that, compared to existing algorithms, the OG-MADRL algorithm not only improves cluster throughput and enhances training stability but also exhibits superior adaptability and robustness in interference environments with dynamic communication demands.
DOI:	10.1016/j.phycom.2025.102770
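
The summary names two generic techniques, orthogonal initialization and global gradient clipping, without giving implementation details. The following is only a minimal sketch of those two ideas in plain Python, not the authors' OG-MADRL code. The function names `orthogonal_matrix` and `clip_by_global_norm`, the Gram-Schmidt construction, the 4 x 4 matrix size, and the `max_norm=1.0` threshold are all assumptions made for illustration.

```python
import math
import random


def orthogonal_matrix(n, rng):
    """Return an n x n matrix with orthonormal rows (Gram-Schmidt on Gaussian rows)."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components along the rows accepted so far.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip a (very unlikely) degenerate draw
            rows.append([a / norm for a in v])
    return rows


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradient vectors jointly so their combined L2 norm is at most max_norm.

    Returns the rescaled gradients and the norm measured before clipping.
    """
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    scale = 1.0 if total <= max_norm else max_norm / total
    return [[g * scale for g in vec] for vec in grads], total


# Hypothetical usage: one weight matrix and two toy gradient vectors.
rng = random.Random(0)
W = orthogonal_matrix(4, rng)            # assumed layer size
grads = [[3.0, 4.0], [12.0]]             # joint L2 norm is 13
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
```

Clipping by the joint norm, rather than per vector, keeps the relative direction of the whole update unchanged while bounding its size; an orthogonal weight matrix preserves the length of vectors passing through it, which is the usual motivation for using it against exploding and vanishing gradients.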