Task offloading strategy and scheduling optimization for internet of vehicles based on deep reinforcement learning

Detailed bibliography
Published in: Ad Hoc Networks, Volume 147, p. 103193
Main authors: Zhao, Xu; Liu, Mingzhen; Li, Maozhen
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.08.2023
ISSN: 1570-8705, 1570-8713
Description
Summary: Driven by the construction of smart cities, network and communication technologies are gradually permeating Internet of Things (IoT) applications in urban infrastructure, such as autonomous driving. In the Internet of Vehicles (IoV) environment, intelligent vehicles generate large volumes of data, but the limited computing power of in-vehicle terminals cannot meet the demand. To solve this problem, we first simulate the task offloading model of the vehicle terminal in a Mobile Edge Computing (MEC) environment. Secondly, based on this model, we design and implement a MEC server collaboration scheme that considers both delay and energy consumption. Thirdly, drawing on optimization theory, we formulate the system optimization problem with the goal of minimizing system cost. Because the resulting problem is a mixed binary nonlinear program, we model it as a Markov Decision Process (MDP), turning the original resource allocation decision into a Reinforcement Learning (RL) problem. To approach the optimal solution, Deep Reinforcement Learning (DRL) is used. Finally, we propose a Deep Deterministic Policy Gradient (DDPG) algorithm to handle task offloading and scheduling optimization in a high-dimensional continuous action space, and an experience replay mechanism is used to accelerate convergence and enhance the stability of the network. Simulation results show that our scheme performs well in terms of convergence, system delay, average task energy consumption, and system cost. For example, compared with the baseline algorithms, system cost improves by 9.12% across different task sizes, which indicates that our scheme is better suited to the highly dynamic Internet of Vehicles environment.
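To make the summary's two key ingredients concrete, the sketch below illustrates (a) a weighted delay-plus-energy system cost of the kind the abstract says is minimized, and (b) the experience replay mechanism used to stabilize DDPG training. This is a minimal illustration, not the paper's implementation: the weights, capacity, and function names (`system_cost`, `ReplayBuffer`) are assumptions for the example.

```python
import random
from collections import deque

def system_cost(delay, energy, w_delay=0.5, w_energy=0.5):
    """Weighted sum of task delay and energy consumption.
    The weights here are illustrative; the paper's exact
    cost formulation may differ."""
    return w_delay * delay + w_energy * energy

class ReplayBuffer:
    """Experience replay: store (state, action, reward, next_state)
    transitions and sample random minibatches, which decorrelates
    consecutive updates and stabilizes DDPG training."""
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # old transitions evicted first

    def push(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform random minibatch for the actor/critic update step.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)

# Example: reward is the negative system cost of an offloading decision.
buf = ReplayBuffer(capacity=5)
for step in range(8):
    reward = -system_cost(delay=1.0, energy=2.0)
    buf.push(step, 0, reward, step + 1)
```

In a full DDPG agent, each minibatch drawn from the buffer would feed the critic's temporal-difference update and the actor's deterministic policy gradient step.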
DOI: 10.1016/j.adhoc.2023.103193