Deep Reinforcement Learning for Delay-Oriented IoT Task Scheduling in SAGIN

In this article, we investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services. In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloa...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on wireless communications Vol. 20; no. 2; pp. 911 - 925
Main Authors: Zhou, Conghao, Wu, Wen, He, Hongli, Yang, Peng, Lyu, Feng, Cheng, Nan, Shen, Xuemin
Format: Journal Article
Language:English
Published: New York IEEE 01.02.2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN:1536-1276, 1558-2248
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:In this article, we investigate a computing task scheduling problem in space-air-ground integrated network (SAGIN) for delay-oriented Internet of Things (IoT) services. In the considered scenario, an unmanned aerial vehicle (UAV) collects computing tasks from IoT devices and then makes online offloading decisions, in which the tasks can be processed at the UAV or offloaded to the nearby base station or the remote satellite. Our objective is to design a task scheduling policy that minimizes offloading and computing delay of all tasks given the UAV energy capacity constraint. To this end, we first formulate the online scheduling problem as an energy-constrained Markov decision process (MDP). Then, considering the task arrival dynamics, we develop a novel deep risk-sensitive reinforcement learning algorithm. Specifically, the algorithm evaluates the risk, which measures the energy consumption that exceeds the constraint, for each state and searches the optimal parameter weighing the minimization of delay and risk while learning the optimal policy. Extensive simulation results demonstrate that the proposed algorithm can reduce the task processing delay by up to 30% compared to probabilistic configuration methods while satisfying the UAV energy capacity constraint.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1536-1276
1558-2248
DOI:10.1109/TWC.2020.3029143