CVAE-based Far-sighted Intention Inference for Opponent Modeling in Multi-agent Reinforcement Learning

Bibliographic Details
Published in: Chinese Control Conference, pp. 5847-5851
Main Authors: Pei, Yu; Xu, XiaoPeng; Liu, Zhong; Wang, Kuo; Zhu, Li; Wang, Dong
Format: Conference Paper
Language: English
Published: Technical Committee on Control Theory, Chinese Association of Automation, 28 July 2024
ISSN: 1934-1768
Description
Summary: Most interactive environments are non-stationary for agents, as the behaviors of their opponents continually change, which can impair the performance of reinforcement learning algorithms. This impairment can be alleviated by modeling opponents to predict their future movements. To predict more precisely and further into the future compared to current opponent modeling approaches, we developed a CVAE-based Far-sighted Intention Inference method (CFI2), including a Trajectory Prediction Module (TPM) and a Trajectory Analysis Module (TAM). TPM synthesizes complex interactions between agents using an attention mechanism and achieves robust far-sighted prediction with a conditional variational autoencoder (CVAE). TAM enables agents to analyze trajectories by assigning attention to the predicted movements of their opponents, corresponding to their impacts on the future. We conducted experiments in Drone Game where CFI2 achieves significantly higher rewards more rapidly than baseline methods. It is proven that agents can make better decisions by incorporating long-term predictions, just like the decision-making process of humans.
DOI: 10.23919/CCC63176.2024.10662404