MATD3 with multiple heterogeneous sub-networks for multi-agent encirclement-combat task

Bibliographic details
Published in: The Journal of Supercomputing, Volume 81, Issue 1, p. 279
Main authors: Yuxin Zhang, Enjiao Zhao, Hong Liang, Wentao Zhou
Format: Journal Article
Language: English
Publication details: New York: Springer Nature B.V., 01.01.2025
ISSN: 0920-8542, 1573-0484
Description
Summary: This paper establishes a game model for the encirclement-combat problem of a red agent team against a blue target agent, set in a multi-agent game with limited attack and defense capabilities and limited communication range. Within the actor-critic framework of the twin delayed multi-agent deep deterministic policy gradient (MATD3) algorithm, the original MATD3 algorithm is improved according to the characteristics of the game scenario to address three problems of the original algorithm: the declining number of agents, sparse rewards, and the frequent sampling of invalid experiences. Simulations show that the proposed algorithm improves convergence speed, learning efficiency, and stability compared with the original algorithm. The sub-networks make the algorithm better suited to game scenarios in which the number of agents declines dynamically; the reward potential function and the prioritized experience replay (PER) based on importance weights sharpen the distinction between experiences and increase the utilization of superior experiences.
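The summary does not spell out the paper's PER variant. As a rough illustration only, the sketch below shows standard proportional prioritized replay with importance-sampling weights, which may differ from the scheme used in the paper. The function name `per_sample` and the parameters `alpha` and `beta` are assumptions made for this example and are not taken from the article.

```python
import random


def per_sample(priorities, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Sample buffer indices with probability proportional to priority**alpha.

    Hypothetical helper illustrating generic proportional PER, not the
    paper's method. All priorities must be strictly positive.
    Returns (indices, weights), where weights are importance-sampling
    corrections normalised so that the largest possible weight is 1.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]

    # Draw indices with replacement according to the priority distribution.
    indices = rng.choices(range(len(priorities)), weights=probs, k=batch_size)

    # Importance-sampling weight: w_i = (N * P(i)) ** (-beta).
    # The least likely transition has the largest weight, so divide by it.
    n = len(priorities)
    max_weight = (n * min(probs)) ** (-beta)
    weights = [((n * probs[i]) ** (-beta)) / max_weight for i in indices]
    return indices, weights
```

In a training loop these weights would typically scale each sampled transition's critic loss, so that frequently replayed high-priority experiences do not bias the update.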
DOI: 10.1007/s11227-024-06756-9