FedLight: Federated Reinforcement Learning for Autonomous Multi-Intersection Traffic Signal Control

Bibliographic Details
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 847-852
Main Authors: Ye, Yutong; Zhao, Wupan; Wei, Tongquan; Hu, Shiyan; Chen, Mingsong
Format: Conference Proceeding
Language: English
Published: IEEE, 5 December 2021
Online Access: Full text
Description
Summary: Although Reinforcement Learning (RL) has been successfully applied in traffic control, it suffers from the problems of high average vehicle travel time and slow convergence to optimized solutions. This is because, due to the scalability restriction, most existing RL-based methods focus on the optimization of individual intersections while the impact of their cooperation is neglected. Without taking all the correlated intersections as a whole into account, it is difficult to achieve global optimization goals for complex traffic scenarios. To address this issue, this paper proposes a novel federated reinforcement learning approach named FedLight to enable optimal signal control policy generation for multi-intersection traffic scenarios. Inspired by federated learning, our approach supports knowledge sharing among RL agents, whose models are trained using decentralized traffic data at intersections. Based on such model-level collaborations, both the overall convergence rate and control quality can be significantly improved. Comprehensive experimental results demonstrate that compared with the state-of-the-art techniques, our approach can not only achieve better average vehicle travel time for various multi-intersection configurations, but also converge to optimal solutions much faster.
DOI: 10.1109/DAC18074.2021.9586175