Best-Response Multiagent Learning in Non-Stationary Environments

Detailed bibliography
Published in:	Autonomous Agents and Multiagent Systems: Proceedings, 3rd International Joint Conference, New York City, New York, 2004, pp. 506-513
Main authors:	Weinberg, Michael; Rosenschein, Jeffrey S.
Format:	Conference paper
Language:	English
Published:	Washington, DC, USA: IEEE Computer Society (IEEE), 19 July 2004
Series:	ACM Conferences
Subjects:	Algorithm design and analysis; Autonomous agents; Computer science; Computing methodologies > Artificial intelligence > Distributed artificial intelligence > Multi-agent systems; Computing methodologies > Artificial intelligence > Planning and scheduling; Computing methodologies > Machine learning > Learning paradigms; Computing methodologies > Machine learning > Machine learning approaches > Markov decision processes; Convergence; Game theory; Learning; Permission; Process control; Stochastic processes; Theory of computation > Theory and algorithms for application domains > Machine learning theory > Markov decision processes
ISBN:	9781581138641, 1581138644

Description
Summary:	This paper investigates a relatively new direction in Multiagent Reinforcement Learning. Most multiagent learning techniques focus on Nash equilibria as elements of both the learning algorithm and its evaluation criteria. In contrast, we propose a multiagent learning algorithm that is optimal in the sense of finding a best-response policy, rather than in reaching an equilibrium. We present the first learning algorithm that is provably optimal against restricted classes of non-stationary opponents. The algorithm infers an accurate model of the opponent's non-stationary strategy, and simultaneously creates a best-response policy against that strategy. Our learning algorithm works within the very general framework of n-player, general-sum stochastic games, and learns both the game structure and its associated optimal policy.
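
Note:	The summary describes two coupled steps: modelling the opponent's strategy and computing a best response to that model. The sketch below illustrates that general idea only, and it is not the algorithm from the paper. It assumes a single-state, two-action matrix game, a known payoff table, and an opponent modelled by simple empirical action frequencies (a stationary simplification); the names `PAYOFF`, `estimate_opponent_strategy`, and `best_response` are hypothetical.

```python
from collections import Counter

# Hypothetical payoff table for the learner in a 2x2 matrix game,
# indexed as PAYOFF[my_action][opponent_action]. Assumed for illustration.
PAYOFF = [
    [1.0, -1.0],
    [-1.0, 1.0],
]


def estimate_opponent_strategy(history, n_actions):
    """Return the empirical frequency of each opponent action.

    Falls back to a uniform distribution when nothing has been observed.
    """
    if not history:
        return [1.0 / n_actions] * n_actions
    counts = Counter(history)
    return [counts[a] / len(history) for a in range(n_actions)]


def best_response(payoff, opponent_strategy):
    """Return the action with the highest expected payoff against the model."""
    expected = [
        sum(p * q for p, q in zip(row, opponent_strategy))
        for row in payoff
    ]
    return max(range(len(payoff)), key=lambda a: expected[a])


# Observed opponent actions so far (made-up data for the example).
observed = [0, 0, 1, 0]
model = estimate_opponent_strategy(observed, n_actions=2)
action = best_response(PAYOFF, model)
```

With the made-up history above, the opponent model puts most weight on action 0, so the learner's best response is the action that pays off against it. A non-stationary opponent, as treated in the paper, would require a richer model than these frequency counts.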
DOI:	10.5555/1018410.1018798