Adherence Improves Cooperation in Sequential Social Dilemmas

Bibliographic details
Published in:	Applied Sciences, Vol. 12, No. 16, p. 8004
Main authors:	Yuan, Yuyu; Guo, Ting; Zhao, Pengqian; Jiang, Hongpu
Format:	Journal Article
Language:	English
Published:	Basel: MDPI AG, 1 August 2022
Subjects:	Algorithms; Behavior; Cooperation; counterfactual reasoning; intrinsic reward; multi-agent reinforcement learning; multi-agent system; sequential social dilemmas
ISSN:	2076-3417

Description
Summary:	Social dilemmas, especially the two-person social dilemma, have guided research on mutual cooperation for decades. Most famously, Tit-for-Tat performs very well in tournaments of the Prisoner’s Dilemma. Nevertheless, these classical settings treat the choice to cooperate or defect as a single atomic action, which cannot capture the complexity of the real world. In recent research, the options to cooperate or defect have been temporally extended. Here, we propose a novel adherence-based multi-agent reinforcement learning algorithm for achieving cooperation and coordination by rewarding agents who adhere to other agents. The evaluation of adherence is based on counterfactual reasoning. During training, each agent replaces its current action and observes how the actions of other agents change, thereby calculating the degree to which other agents adhere to its behavior. Using adherence as an intrinsic reward enables agents to consider the collective, thus promoting cooperation. In addition, the adherence rewards of all agents are calculated in a decentralized way. We experiment in sequential social dilemma environments, and the results demonstrate the potential of the algorithm to enhance cooperation and coordination and to significantly increase the scores of the deep RL agents.
DOI:	10.3390/app12168004
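
Illustrative note:	The summary describes adherence as a counterfactual quantity: an agent swaps out its own action and checks how much another agent's behavior changes. The record does not give the formula, so the sketch below is only one plausible reading, not the authors' method. It assumes discrete actions and uses a KL divergence between the other agent's response to the action actually taken and its average response over counterfactual actions. The names `adherence_score`, `shaped_reward`, `other_policy`, `my_action_probs`, and `weight` are all hypothetical.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists of probabilities."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


def adherence_score(other_policy, my_action, my_action_probs):
    """Hypothetical adherence measure (an assumption, not the paper's formula).

    other_policy[a] is the other agent's action distribution when this agent
    takes action a. my_action_probs[a] is the probability that this agent
    takes action a. The score is large when the other agent's behavior
    depends strongly on this agent's action, and zero when it ignores it.
    """
    n_my_actions = len(other_policy)
    n_other_actions = len(other_policy[my_action])
    # Counterfactual baseline: the other agent's average response over all
    # actions this agent could have taken instead.
    baseline = [
        sum(my_action_probs[a] * other_policy[a][b] for a in range(n_my_actions))
        for b in range(n_other_actions)
    ]
    return kl_divergence(other_policy[my_action], baseline)


def shaped_reward(extrinsic, adherence, weight=0.1):
    """Add the adherence term as an intrinsic bonus (weight is an assumed hyperparameter)."""
    return extrinsic + weight * adherence


# An agent that ignores us versus one that mirrors our action.
ignoring = [[0.5, 0.5], [0.5, 0.5]]
mirroring = [[0.9, 0.1], [0.1, 0.9]]
uniform = [0.5, 0.5]

score_ignoring = adherence_score(ignoring, my_action=0, my_action_probs=uniform)
score_mirroring = adherence_score(mirroring, my_action=0, my_action_probs=uniform)
```

In this toy setup the mirroring agent scores higher than the ignoring one, so a bonus built from the score would favor agents whose behavior responds to others. Each agent can compute the score from its own counterfactuals, which is consistent with the decentralized calculation mentioned in the summary.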