Learning to reschedule platforms: A graph neural network based deep reinforcement learning method for the train platforming and rescheduling problem


Bibliographic details
Published in: Transportation Research Part C: Emerging Technologies, Vol. 183, Art. 105453
Authors: Zhang, Hongxiang; D’Ariano, Andrea; Zhu, Yongqiu; Wu, Yaoxin; Hu, Liuyang; Lu, Gongyuan
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.02.2026
ISSN: 0968-090X
Online access: Full text
Description
Abstract:
• A novel deep reinforcement learning framework for the Train Platforming and Rescheduling Problem.
• Integration of a microscopic simulation into the deep reinforcement learning framework.
• Solution quality competitive with modern heuristics.
• High and stable solving efficiency, regardless of the severity of train delay disruptions.
• Strong generalization capability to unseen scenarios and line disruptions.

The train platforming schedule is the plan that guides trains through a railway station without spatial or temporal conflicts. When trains arrive at the station late due to disturbances or disruptions, the Train Platforming and Rescheduling Problem (TPRP) arises, one of the central topics in railway traffic management. It focuses on allocating platforms and time slots to trains so as to reduce delays and ensure operational efficiency in the station. This paper introduces a novel graph neural network based deep reinforcement learning method for this problem, named Learning to Reschedule Platforms (L2RP). We formulate the solving process of the TPRP as a customized Markov decision process, and we integrate a microscopic discrete-event train operation simulation model as the agent's exploration environment, which provides states, executes actions, and completes transitions. We then design a hybrid graph neural network based policy network to derive high-quality actions for each graph-encoded state. The policy network is trained with a reward function designed to minimize total train knock-on delays and platform changes. Experiments on real-world instances show that the proposed L2RP method produces high-quality solutions across a variety of scenarios within stably short solving times.
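The abstract states that the policy network is trained with a reward minimizing total train knock-on delays and platform changes. The following is a minimal illustrative sketch of such a reward term; all names, data fields, and weights are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical sketch of a reward of the kind the abstract describes:
# a negative weighted sum of total knock-on delay and platform changes.
# Field names and weights are illustrative assumptions, not from the paper.
from dataclasses import dataclass

@dataclass
class TrainAssignment:
    train_id: str
    scheduled_arrival: float   # minutes
    actual_arrival: float      # minutes
    planned_platform: int
    assigned_platform: int

def knock_on_delay(a: TrainAssignment) -> float:
    """Delay beyond the schedule; early arrivals count as zero."""
    return max(0.0, a.actual_arrival - a.scheduled_arrival)

def step_reward(assignments, delay_weight=1.0, change_weight=5.0):
    """Negative weighted sum of total knock-on delay and platform changes."""
    total_delay = sum(knock_on_delay(a) for a in assignments)
    changes = sum(1 for a in assignments
                  if a.assigned_platform != a.planned_platform)
    return -(delay_weight * total_delay + change_weight * changes)

trains = [
    TrainAssignment("T1", 10.0, 14.0, 1, 1),  # 4 min knock-on delay
    TrainAssignment("T2", 20.0, 20.0, 2, 3),  # on time, platform changed
]
print(step_reward(trains))  # → -9.0
```

In an actual RL setup this reward would be emitted per transition by the discrete-event simulation environment; the weight between delay and platform changes is a modeling choice.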
DOI:10.1016/j.trc.2025.105453