A tractable online learning algorithm for the multinomial logit contextual bandit
• This work considers the dynamic assortment optimization problem.
• Consumer choice is modelled via a multinomial logit contextual model.
• The worst-case regret bound is free from the multiplicative problem-dependent factor k and improves upon the previous bounds in the literature.
• The k factor could...
| Published in: | European journal of operational research Vol. 310; no. 2; pp. 737 - 750 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier B.V, 16.10.2023 |
| Subjects: | |
| ISSN: | 0377-2217, 1872-6860 |