A tractable online learning algorithm for the multinomial logit contextual bandit

• This work considers the dynamic assortment optimization problem.
• Consumer choice is modelled via a multinomial logit contextual model.
• The worst-case regret bound is free from the multiplicative problem-dependent factor k and improves upon previous bounds in the literature.
• The k factor could...

Bibliographic details
Published in:	European Journal of Operational Research, Vol. 310, No. 2, pp. 737-750
Main authors:	Agrawal, Priyank; Tulabandhula, Theja; Avadhanula, Vashist
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 16 October 2023
Subjects:	Multi-armed bandit; Multinomial logit model; OR in marketing; Revenue management; Sequential decision-making
ISSN:	0377-2217, 1872-6860
Online access:	Full text