Opportunities for reinforcement learning in stochastic dynamic vehicle routing

Bibliographic details
Published in: Computers & operations research, Vol. 150, p. 106071
Main authors: Hildebrandt, Florentin D., Thomas, Barrett W., Ulmer, Marlin W.
Format: Journal Article
Language: English
Published: Elsevier Ltd, 1 February 2023
ISSN: 0305-0548, 1873-765X
Description
Summary: Urban logistics services have undergone a paradigm shift in recent years as demand for real-time, instant mobility and delivery services grows. This poses new challenges to logistics service providers, because the underlying stochastic dynamic vehicle routing problems (SDVRPs) require anticipatory real-time routing actions. The complexity of finding efficient routing actions is compounded by the challenge of evaluating such actions with respect to their effectiveness given future dynamism and uncertainty. Reinforcement learning (RL) is a promising tool for evaluating actions, but it is not designed for searching a complex, combinatorial action space. Thus, past work on RL for SDVRPs has either restricted the action space, that is, solved only subproblems by RL and everything else by established heuristics, or focused on problems that reduce to resource allocation problems. Solving real-world SDVRPs requires new strategies that address the combined challenge of a combinatorial, constrained action space and future uncertainty, but as our findings suggest, such strategies are essentially nonexistent. Our survey shows that past work either relied on action-space restriction or avoided routing actions entirely, and it highlights opportunities for more holistic solutions.
• We discuss the challenges and opportunities for RL in SDVRPs.
• We carefully review and classify the existing literature.
• We use examples and pseudocode for illustration throughout the paper.
• We present two specific and promising means to improve future work in this domain.
DOI: 10.1016/j.cor.2022.106071