Hierarchical Reinforcement Learning-Based Routing Algorithm With Grouped RSU in Urban VANETs

Bibliographic details
Published in:	IEEE Transactions on Intelligent Transportation Systems, Vol. 25, No. 8, pp. 10131-10146
Main authors:	Yang, Qin; Yoo, Sang-Jo
Format:	Journal Article
Language:	English
Published:	IEEE, 1 August 2024
Subjects:	Data routing; distributed learning; Heuristic algorithms; intelligent transportation systems (ITS); Measurement; Q-learning; reinforcement learning (RL); Roads; roadside unit (RSU); Routing; Vehicle-to-everything; Vehicular ad hoc networks; vehicular ad hoc networks (VANETs)
ISSN:	1524-9050, 1558-0016

Description
Summary:	The rapid growth of the Internet of Vehicles (IoV) has generated significant interest in routing techniques for vehicular ad hoc networks (VANETs) in both academic and industrial communities. To address the complexity of urban environments and dynamic vehicle mobility, we propose a hierarchical Q-learning-based routing algorithm with grouped roadside units (RSUs) for VANETs. RSUs are grouped, and a Q-vector containing group information is exchanged through vehicle-to-everything (V2X) communications. Q-vector-based road-segment (QVRS) control messages are periodically broadcast to refresh the V2X evaluation metric, which considers vehicle positions, velocities, directions, and communication conditions. To adapt to the nonstationary vehicular environment, a multi-agent reinforcement learning (RL) algorithm runs on the RSU at each intersection to achieve distributed learning and local decisions. On each RSU, the hierarchical Q-learning algorithm trains a group Q-table and a local Q-table separately for reaching destinations. Data routing decisions are made from these two Q-tables, with the integrated V2X metric serving as the reward function. Simulation results demonstrate that our proposed method reduces broadcasting overhead, prolongs path lifetime, and maintains a high packet delivery ratio and a low average end-to-end delay. The incorporation of group design in our method accelerates the learning process, which facilitates more efficient communication in VANETs.
DOI:	10.1109/TITS.2024.3353258
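
Illustrative sketch (not part of the catalog record): the summary describes each RSU keeping a group Q-table and a local Q-table and using a V2X metric as the reward. The Python below is a minimal sketch of that two-table idea under stated assumptions, not the authors' implementation. The class name `RsuAgent`, the rule "use the local table when the destination is in the same group, otherwise the group table", the learning rate `alpha`, the discount `gamma`, and the scalar `reward` are all hypothetical; the abstract does not specify the state and action definitions, the exact reward formula, or how Q-vectors are exchanged. The update is the standard Q-learning rule.

```python
from collections import defaultdict


class RsuAgent:
    """Hypothetical RSU agent with a group-level and a local Q-table."""

    def __init__(self, rsu_id, group_id, neighbors, alpha=0.5, gamma=0.9):
        self.rsu_id = rsu_id
        self.group_id = group_id
        self.neighbors = list(neighbors)  # adjacent RSUs (candidate next hops)
        self.alpha = alpha                # learning rate (assumed value)
        self.gamma = gamma                # discount factor (assumed value)
        # table[destination_key][neighbor] -> Q-value, defaulting to 0.0
        self.group_q = defaultdict(lambda: defaultdict(float))
        self.local_q = defaultdict(lambda: defaultdict(float))

    def _table_and_key(self, dest_rsu, dest_group):
        # Assumption: same-group destinations use the local table (keyed by
        # RSU); other destinations use the group table (keyed by group).
        if dest_group == self.group_id:
            return self.local_q, dest_rsu
        return self.group_q, dest_group

    def best_value(self, dest_rsu, dest_group):
        """Best Q-value toward the destination, as a neighbor might report it."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        return max((table[key][n] for n in self.neighbors), default=0.0)

    def choose_next_hop(self, dest_rsu, dest_group):
        """Greedy choice: the neighbor with the highest Q-value."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        return max(self.neighbors, key=lambda n: table[key][n])

    def update(self, dest_rsu, dest_group, neighbor, reward, neighbor_best):
        """Standard Q-learning update for forwarding via `neighbor`."""
        table, key = self._table_and_key(dest_rsu, dest_group)
        target = reward + self.gamma * neighbor_best
        table[key][neighbor] = (1 - self.alpha) * table[key][neighbor] + self.alpha * target
        return table[key][neighbor]


# Usage: RSU "A" in group 0 learns routes toward RSU "Z" in group 1.
agent = RsuAgent("A", group_id=0, neighbors=["B", "C"])
agent.update("Z", 1, "B", reward=0.8, neighbor_best=0.0)
agent.update("Z", 1, "C", reward=0.2, neighbor_best=0.0)
print(agent.choose_next_hop("Z", 1))  # prints B
```

Keying the group table by destination group rather than by destination RSU keeps it small, which is one plausible reading of why grouping could speed up learning; the paper itself should be consulted for the actual design.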