Online Distributed Algorithms for Online Noncooperative Games With Stochastic Cost Functions: High Probability Bound of Regrets


Bibliographic details
Published in:	IEEE Transactions on Automatic Control, Vol. 69, No. 12, pp. 8860-8867
Main author:	Lu, Kaihong
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 December 2024; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithms; Cost function; Distributed algorithms; Dynamic regrets; Game theory; Games; Heuristic algorithms; high probability bound; Nash equilibria seeking; Nash equilibrium; Noise measurement; noncooperative game; Random noise; Stochastic processes
ISSN:	0018-9286, 1558-2523
Online access:	Full text

Description
Abstract:	This article studies online noncooperative games without full decision information, in which the players aim to seek the Nash equilibria in a distributed manner. Unlike existing work on online noncooperative games, it considers the case where the cost functions are stochastic. Each player has access only to a noisy gradient of its own cost function and to a local action set, and must make its decision before the current noisy gradient information is revealed. To handle this problem, an online distributed stochastic mirror descent algorithm is proposed. The performance of the algorithm is measured by dynamic regrets, where the offline benchmark is the Nash equilibrium point at each time. In particular, a high-probability bound on the dynamic regrets is established on the basis of the sub-Gaussian noise model. The result shows that if the variation in the Nash equilibrium sequence grows at a certain rate, then the regrets increase sublinearly with high probability. Finally, simulations are presented to demonstrate the effectiveness of the theoretical results.
DOI:	10.1109/TAC.2024.3419018
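
Illustration (not taken from the article): the abstract describes players who commit to an action before seeing a noisy gradient and then update by mirror descent. The record does not give the article's algorithm, so the following is only a minimal single-player sketch under assumptions chosen for illustration: the action set is the probability simplex, the mirror map is negative entropy, the cost at each round is linear, the noise is Gaussian (one example of sub-Gaussian noise), and there is no communication between players. The names `entropic_mirror_step`, `run_player`, and `dynamic_regret` are hypothetical, and the regret here is measured against the per-round minimizer rather than against a Nash equilibrium.

```python
import math
import random


def entropic_mirror_step(x, grad, step):
    """One mirror descent step on the probability simplex (negative-entropy mirror map)."""
    # Multiplicative update followed by normalisation. Shifting by min(grad)
    # keeps the exponentials stable and does not change the normalised result.
    g_min = min(grad)
    weights = [xi * math.exp(-step * (gi - g_min)) for xi, gi in zip(x, grad)]
    total = sum(weights)
    return [w / total for w in weights]


def run_player(costs, noise_std=0.1, step=0.5, seed=0):
    """Play one round per entry of costs; costs[t] is the linear cost vector at round t.

    The action for round t is fixed before that round's noisy gradient is observed.
    Returns the list of actions played.
    """
    rng = random.Random(seed)
    n = len(costs[0])
    x = [1.0 / n] * n  # start from the uniform distribution
    played = []
    for c in costs:
        played.append(x)  # commit to the action first
        # Assumed noise model: the true gradient (c, for a linear cost) plus Gaussian noise.
        noisy_grad = [ci + rng.gauss(0.0, noise_std) for ci in c]
        x = entropic_mirror_step(x, noisy_grad, step)
    return played


def dynamic_regret(costs, played):
    """Sum over rounds of (cost incurred) minus (best cost achievable in that round)."""
    return sum(
        sum(ci * xi for ci, xi in zip(c, x)) - min(c)
        for c, x in zip(costs, played)
    )
```

For example, `played = run_player([[1.0, 0.0]] * 50)` followed by `dynamic_regret([[1.0, 0.0]] * 50, played)` gives the accumulated regret for a fixed cost vector; a time-varying list of cost vectors would play the role of the moving benchmark mentioned in the abstract.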