Opportunities for reinforcement learning in stochastic dynamic vehicle routing

Bibliographic Details
Published in: Computers & operations research, Vol. 150, p. 106071
Main Authors: Hildebrandt, Florentin D.; Thomas, Barrett W.; Ulmer, Marlin W.
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 February 2023
ISSN:0305-0548, 1873-765X
Online Access: Full text
Description
Abstract: There has been a paradigm shift in urban logistics services in recent years: demand for real-time, instant mobility and delivery services is growing. This poses new challenges to logistics service providers, as the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. The complexity of finding efficient routing actions is multiplied by the challenge of evaluating such actions with respect to their effectiveness given future dynamism and uncertainty. Reinforcement learning (RL) is a promising tool for evaluating actions, but it is not designed for searching the complex and combinatorial action space. Thus, past work on RL for SDVRPs has either restricted the action space, that is, solved only subproblems by RL and everything else by established heuristics, or focused on problems that reduce to resource allocation problems. Solving real-world SDVRPs requires new strategies that address the combined challenge of a combinatorial, constrained action space and future uncertainty, but as our findings suggest, such strategies are essentially nonexistent. Our survey paper shows that past work either relied on action-space restriction or avoided routing actions entirely, and it highlights opportunities for more holistic solutions.
• We discuss the challenges and opportunities for RL in SDVRP.
• We carefully review and classify the existing literature.
• We use examples and pseudocode for illustration throughout the paper.
• We present two specific and promising means to improve future work in this domain.
DOI:10.1016/j.cor.2022.106071