DNN Inference Acceleration for Smart Devices in Industry 5.0 by Decentralized Deep Reinforcement Learning

Bibliographic details
Published in: IEEE Transactions on Consumer Electronics, vol. 70, no. 1, pp. 1519-1530
Main authors: Dong, Chongwu, Shafiq, Muhammad, Dabel, Maryam M. Al, Sun, Yanbin, Tian, Zhihong
Format: Journal Article
Language: English
Published: New York, IEEE, 1 February 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0098-3063, 1558-4127
Online access: Full text
Description
Abstract: With the emergence of Industry 5.0, demand for intelligent services on smart devices has surged, and deep neural networks (DNNs) have become the predominant technology behind intelligent applications. With the help of mobile edge computing (MEC), resource-constrained smart devices, such as industrial Internet of Things (IIoT) devices, can meet the high computing requirements of DNN-based inference through computation offloading. When a central decision-maker with global information computes the task offloading strategy, all devices in the MEC system can obtain the optimal solution for DNN inference acceleration. In a practical environment, however, central decision-making can run into problems such as information synchronization delay, irrational device behavior, and privacy leakage. In this paper, we explore distributed task offloading for smart devices to address these challenges in DNN inference acceleration, exploiting the early-exit property of DNN models to balance accuracy and latency. In our system model, the optimization is formulated as a decentralized partially observable Markov decision process (Dec-POMDP). Each smart device executes its own strategy, comprising a task offloading decision and a DNN branch selection based on local observations, and the devices cooperatively optimize the overall Quality of Experience for DNN inference. Based on the Dec-POMDP model, we propose a multi-agent reinforcement learning algorithm to solve this problem. The algorithm uses an advantage function based on a counterfactual baseline to guide policy gradient learning and overcome the credit assignment problem in cooperative optimization. In addition, an LSTM is introduced to improve the robustness of the algorithm. Finally, a detailed performance evaluation and comparison demonstrate the effectiveness of our strategy.
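
The counterfactual baseline mentioned in the abstract can be illustrated with a small sketch. It assumes a COMA-style formulation, in which one agent's advantage is the critic value of the joint action minus the expected value over that agent's own alternative actions, with the other agents' actions held fixed. The abstract does not give the paper's exact formulation, so the function name, the three example actions, and all numbers below are hypothetical.

```python
from typing import Sequence


def counterfactual_advantage(
    q_values: Sequence[float],
    policy_probs: Sequence[float],
    chosen_action: int,
) -> float:
    """Advantage of one agent's chosen action against a counterfactual baseline.

    q_values[k]:     assumed critic estimate of the joint-action value when this
                     agent plays action k and all other agents' actions stay fixed.
    policy_probs[k]: this agent's policy probability for action k.
    """
    if len(q_values) != len(policy_probs):
        raise ValueError("q_values and policy_probs must have the same length")
    # Baseline: expected value under the agent's own policy, others held fixed.
    baseline = sum(p * q for p, q in zip(policy_probs, q_values))
    return q_values[chosen_action] - baseline


# Illustrative numbers only (not taken from the paper). The three actions could
# stand for "run locally", "offload with early exit", "offload with full model".
q = [1.0, 2.0, 4.0]
pi = [0.5, 0.25, 0.25]
adv = counterfactual_advantage(q, pi, chosen_action=2)
print(adv)
```

Because the baseline depends only on the agent's own policy and the other agents' fixed actions, the policy-weighted advantages of a single agent sum to zero, so subtracting the baseline changes the variance of the policy gradient estimate without changing its expectation.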
DOI: 10.1109/TCE.2023.3339468