MATD3 with multiple heterogeneous sub-networks for multi-agent encirclement-combat task

Bibliographic details
Published in: The Journal of supercomputing, Vol. 81; No. 1; p. 279
Main authors: Yuxin, Zhang; Enjiao, Zhao; Hong, Liang; Wentao, Zhou
Format: Journal Article
Language: English
Published: New York, Springer Nature B.V., 01.01.2025
ISSN:0920-8542, 1573-0484
Description
Abstract: This paper establishes a game model for a multi-agent game with limited attack and defense capabilities and limited communication range, and uses it to study the encirclement-combat problem of a red agent team against a blue target agent. Within the actor-critic framework of the twin delayed multi-agent deep deterministic policy gradient (MATD3) algorithm, the original algorithm is modified to suit the game scenario and to address three of its problems: the declining number of agents, sparse rewards, and the frequent sampling of invalid experiences. Simulations show that the proposed algorithm improves on the original in convergence speed, learning efficiency, and stability. The sub-networks make the algorithm better suited to game scenarios in which the number of agents declines dynamically; the reward potential function and the prioritized experience replay (PER) based on importance weights sharpen the distinction between experiences and raise the utilization rate of superior experiences.
DOI:10.1007/s11227-024-06756-9