Toward Packet Routing With Fully Distributed Multiagent Deep Reinforcement Learning

Bibliographic details
Published in:	IEEE Transactions on Systems, Man, and Cybernetics: Systems, Volume 52, Issue 2, pp. 855-868
Main authors:	You, Xinyu; Li, Xuanjie; Xu, Yuedong; Feng, Hui; Zhao, Jin; Yan, Huaicheng
Format:	Journal Article
Language:	English
Published:	New York: IEEE, 1 February 2022; The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Biological neural networks; Computer networks; Decision making; Deep learning; Deep reinforcement learning (DRL); Delays; Feature extraction; Heuristic algorithms; local communications; Machine learning; multiagent learning; Multiagent systems; Network topologies; Optimization; packet routing; Prediction algorithms; Recurrent neural networks; Routing; Short term; Training
ISSN:	2168-2216, 2168-2232

Description
Summary:	Packet routing is one of the fundamental problems in computer networks, in which a router determines the next hop of each packet in its queue to get it to its destination as quickly as possible. Reinforcement learning (RL) has been introduced to design autonomous packet routing policies using local information about stochastic packet arrival and service. However, the curse of dimensionality in RL prevents a more comprehensive representation of dynamic network states, thus limiting its potential benefit. In this article, we propose a novel packet routing framework based on multiagent deep RL (DRL) in which each router possesses an independent long short-term memory (LSTM) recurrent neural network (RNN) for training and decision making in a fully distributed environment. The LSTM RNN extracts routing features from rich information regarding backlogged packets and past actions, and effectively approximates the value function of Q-learning. We further allow each router to communicate periodically with its direct neighbors so that a broader view of the network state can be incorporated. The experimental results show that our multiagent DRL policy can strike a delicate balance between congestion-aware and shortest routes, and significantly reduces the packet delivery time in general network topologies compared with its counterparts.
DOI:	10.1109/TSMC.2020.3012832