DNN Inference Acceleration for Smart Devices in Industry 5.0 by Decentralized Deep Reinforcement Learning
| Published in: | IEEE Transactions on Consumer Electronics, Vol. 70, No. 1, pp. 1519-1530 |
|---|---|
| Main authors: | , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 February 2024 |
| Subjects: | |
| ISSN: | 0098-3063, 1558-4127 |
| Summary: | With the emergence of Industry 5.0, demand for intelligent services on smart devices has surged. Deep neural networks (DNNs) are now the predominant technology driving intelligent applications. With the help of mobile edge computing (MEC), resource-constrained smart devices, such as industrial Internet of Things (IIoT) devices, can meet the high computing requirements of DNN-based inference through computation offloading. When a central decision-maker with global information determines the task offloading strategy, all devices in the MEC system can obtain the optimal solution for DNN inference acceleration. In practical environments, however, central decision-making can run into problems such as information synchronization delay, irrational device behavior, and privacy leakage. In this paper, we explore distributed task offloading optimization for smart devices to address these challenges in DNN inference acceleration, exploiting the early-exit property of DNN models to balance accuracy and latency. In our system model, the optimization is formulated as a decentralized partially observable Markov decision process (Dec-POMDP). Each smart device executes its own strategy, comprising a task offloading decision and a DNN branch selection, based on local observation, and the devices cooperatively optimize the overall Quality of Experience for DNN inference. Based on the Dec-POMDP model, we propose an algorithm based on multi-agent reinforcement learning to solve this problem. The algorithm uses an advantage function based on a counterfactual baseline to guide policy gradient learning and overcome the credit assignment problem in cooperative optimization. In addition, a long short-term memory (LSTM) network is introduced to improve the robustness of the algorithm. Finally, detailed performance evaluation and comparison demonstrate the effectiveness of our strategy. |
| DOI: | 10.1109/TCE.2023.3339468 |
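
The summary mentions an advantage function built on a counterfactual baseline. As a rough illustration of that general idea (not the paper's implementation), the sketch below computes one agent's advantage as the critic's value for the action it actually took minus the policy-weighted average over its alternative actions, with the other agents' actions held fixed. The function name, argument names, and the numbers are all hypothetical.

```python
def counterfactual_advantage(q_values, policy, action):
    """Advantage of one agent's chosen action against a counterfactual baseline.

    q_values[k]: assumed centralized-critic estimate of the joint value when
                 this agent takes action k and all other agents' actions
                 stay fixed (hypothetical input).
    policy[k]:   probability this agent's policy assigns to action k.
    action:      index of the action the agent actually took.
    """
    if len(q_values) != len(policy):
        raise ValueError("q_values and policy must have the same length")
    # Baseline: expected value when only this agent's action is marginalized out.
    baseline = sum(p * q for p, q in zip(policy, q_values))
    return q_values[action] - baseline


# Illustrative numbers only: three candidate actions for one device.
q = [1.0, 3.0, 2.0]
pi = [0.5, 0.25, 0.25]
adv = counterfactual_advantage(q, pi, action=1)
```

Because the baseline depends only on the other agents' actions and this agent's policy, a positive value indicates that the chosen action contributed more than the agent's average behaviour would have, which is how such a baseline helps attribute a shared reward to individual agents.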