A Scalable Reinforcement Learning Algorithm for Scheduling Railway Lines

This paper describes an algorithm for scheduling bidirectional railway lines (both single- and multi-track) using a reinforcement learning (RL) approach. The goal is to define the track allocations and arrival/departure times for all trains on the line, given their initial positions, priority, halt...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on intelligent transportation systems Vol. 20; no. 2; pp. 727 - 736
Main Author: Khadilkar, Harshad
Format: Journal Article
Language:English
Published: New York IEEE 01.02.2019
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:1524-9050, 1558-0016
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:This paper describes an algorithm for scheduling bidirectional railway lines (both single- and multi-track) using a reinforcement learning (RL) approach. The goal is to define the track allocations and arrival/departure times for all trains on the line, given their initial positions, priority, halt times, and traversal times, while minimizing the total priority-weighted delay. The primary advantage of the proposed algorithm compared to exact approaches is its scalability, and compared to heuristic approaches is its solution quality. Efficient scaling is ensured by decoupling the size of the state-action space from the size of the problem instance. Improved solution quality is obtained because of the inherent adaptability of reinforcement learning to specific problem instances. An additional advantage is that the learning from one instance can be transferred with minimal re-learning to another instance with different infrastructure resources and traffic mix. It is shown that the solution quality of the RL algorithm exceeds that of two prior heuristic-based approaches while having comparable computation times. Two lines from the Indian rail network are used for demonstrating the applicability of the proposed algorithm in the real world.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1524-9050
1558-0016
DOI:10.1109/TITS.2018.2829165