Multiendpoint DAG-Driven Joint Partitioning-Offloading and Scheduling Optimization for DNN Inference

Detailed description

Bibliographic details
Published in: IEEE Internet of Things Journal, Vol. 12, No. 19, pp. 41087-41102
Main authors: Yan, Xiukun; Zhang, Xuexue; Zeng, Kai; Bai, Fenhua; Shen, Tao; Cao, Bin
Format: Journal Article
Language: English
Published: Piscataway: IEEE, October 1, 2025
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 2327-4662
Online access: Full text
Abstract: Model partitioning techniques, which decompose deep neural networks (DNNs) into subtasks and execute them collaboratively, have emerged as a critical strategy for enhancing distributed inference efficiency. However, in mobile edge computing (MEC), dynamic load fluctuations at edge nodes and the complexity of cross-node task dependencies make the delay minimization problem extremely challenging. Existing studies predominantly adopt a decoupled optimization framework that addresses partitioning-offloading and pipeline scheduling separately, neglecting their inherent cyclic state-dependent coupling. This oversight leads to suboptimal solutions, such as pipeline stagnation caused by mismatched computation and communication timestamps. To address these challenges, we propose a multiendpoint directed acyclic graph (DAG)-driven cooperative optimization approach that jointly handles partitioning-offloading and pipeline scheduling in MEC. Specifically, the approach involves two core steps: 1) Dynamic prescheduling: We propose an improved DNN scheduling algorithm for constrained subtasks, which simulates node-level queuing delays and pipeline stalls under real-world constraints, translating runtime states into latency objectives. 2) Partitioning and offloading solution retrieval: Based on the latency objectives, we introduce a novel multiendpoint DAG structure and design a multinode collaborative optimization retrieval algorithm, enabling adaptive partitioning-offloading remapping of subtasks. Experiments show that the proposed method outperforms other advanced methods, reducing the time overhead by an average of 24% and 75% in two different scenarios, respectively. The source code is available at: https://github.com/aiheiheiheii/Partition_Scheduling.git
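The abstract describes a prescheduling step that simulates node-level queuing delays and pipeline stalls to obtain latency objectives. The following is a minimal illustrative sketch of that general idea, not the authors' algorithm: it replays a DAG of subtasks under a fixed placement, with each node serving one subtask at a time and a communication delay charged on cross-node edges. The function name `simulate_latency`, its parameters, and the example numbers are all assumptions introduced here for illustration.

```python
from collections import defaultdict, deque


def simulate_latency(compute, deps, placement, transfer):
    """Estimate end-to-end latency of a subtask DAG under a fixed placement.

    All argument names are hypothetical, chosen for this sketch:
    compute:   {task: execution time on its assigned node}
    deps:      {task: [predecessor tasks]}
    placement: {task: node id}
    transfer:  {(pred, task): delay paid only if the two tasks sit on different nodes}
    """
    # Build in-degrees and successor lists for a Kahn-style topological walk.
    indeg = {t: len(deps.get(t, [])) for t in compute}
    succs = defaultdict(list)
    for t, preds in deps.items():
        for p in preds:
            succs[p].append(t)

    ready = deque(sorted(t for t, d in indeg.items() if d == 0))
    node_free = defaultdict(float)  # time at which each node becomes idle
    finish = {}

    while ready:
        t = ready.popleft()
        # A task may start only after all inputs have arrived (stall on data).
        data_ready = 0.0
        for p in deps.get(t, []):
            cross = placement[p] != placement[t]
            delay = transfer.get((p, t), 0.0) if cross else 0.0
            data_ready = max(data_ready, finish[p] + delay)
        # It must also wait for its node to drain earlier work (queuing delay).
        start = max(data_ready, node_free[placement[t]])
        finish[t] = start + compute[t]
        node_free[placement[t]] = finish[t]
        for s in succs[t]:
            indeg[s] -= 1
            if indeg[s] == 0:
                ready.append(s)

    if len(finish) != len(compute):
        raise ValueError("dependency graph contains a cycle")
    return max(finish.values()), finish


# Hypothetical four-subtask example: a -> {b, c} -> d, with c offloaded to node 1.
compute = {"a": 2, "b": 3, "c": 4, "d": 1}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
placement = {"a": 0, "b": 0, "c": 1, "d": 0}
transfer = {("a", "c"): 1, ("c", "d"): 2}
latency, finish_times = simulate_latency(compute, deps, placement, transfer)
```

In a joint scheme of the kind the abstract outlines, a latency estimate like this could serve as the objective that a partitioning-offloading search tries to reduce; how the paper actually couples the two steps is not specified in this record.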
Author_xml – sequence: 1
  givenname: Xiukun
  orcidid: 0009-0008-7691-3427
  surname: Yan
  fullname: Yan, Xiukun
  email: xk_yan@stu.kust.edu.cn
  organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China
– sequence: 2
  givenname: Xuexue
  orcidid: 0009-0002-9524-6216
  surname: Zhang
  fullname: Zhang, Xuexue
  email: zhangxuexue@stu.kust.edu.cn
  organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China
– sequence: 3
  givenname: Kai
  orcidid: 0000-0003-2662-1596
  surname: Zeng
  fullname: Zeng, Kai
  email: zengkai@kust.edu.cn
  organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China
– sequence: 4
  givenname: Fenhua
  orcidid: 0000-0002-2505-0288
  surname: Bai
  fullname: Bai, Fenhua
  email: baifenhua@kust.edu.cn
  organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China
– sequence: 5
  givenname: Tao
  orcidid: 0000-0003-1273-7950
  surname: Shen
  fullname: Shen, Tao
  email: shentao@kust.edu.cn
  organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China
– sequence: 6
  givenname: Bin
  orcidid: 0000-0001-8839-243X
  surname: Cao
  fullname: Cao, Bin
  email: caobin@bupt.edu.cn
  organization: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China
CODEN IITJAU
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025
DOI 10.1109/JIOT.2025.3591531
Discipline Computer Science
EISSN 2327-4662
EndPage 41102
ExternalDocumentID 10_1109_JIOT_2025_3591531
11089940
Genre orig-research
GrantInformation_xml – fundername: Major Project on Special Education for Linguistic Intelligence in the Key Laboratory of Philosophy and Social Science of Sichuan Province
  grantid: YYZN-2024-1
– fundername: Photonics Fund Class A
  grantid: ghfund202407010460
– fundername: Yunnan Fundamental Research Projects
  grantid: 202301AV070003; 202501AT070345
– fundername: National Natural Science Foundation of China
  grantid: 62471205
  funderid: 10.13039/501100001809
– fundername: Major Science and Technology Projects in Yunnan Province
  grantid: 202302AG050009; 202202AD080013
  funderid: 10.13039/501100018531
ISICitedReferencesCount 1
ISSN 2327-4662
IsPeerReviewed false
IsScholarly true
Issue 19
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
PageCount 16
PublicationDate 2025-10-01
PublicationDateYYYYMMDD 2025-10-01
PublicationDate_xml – month: 10
  year: 2025
  text: 2025-10-01
  day: 01
PublicationDecade 2020
PublicationPlace Piscataway
PublicationPlace_xml – name: Piscataway
PublicationTitle IEEE internet of things journal
PublicationTitleAbbrev JIoT
PublicationYear 2025
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 41087
SubjectTerms Adaptive algorithms
Artificial neural networks
Chemical partition
Collaboration
Computational modeling
Constrained scheduling
deep neural network (DNN) inference
Delays
Design optimization
distributed computing
Dynamic loads
Dynamic scheduling
Edge computing
Inference
Internet of Things
Load fluctuation
Mobile computing
model partitioning
multiendpoint directed acyclic graph (DAG)
Optimization
Partitioning
Pipelines
Queueing
Resource management
Retrieval
Scheduling
Servers
URI https://ieeexplore.ieee.org/document/11089940
https://www.proquest.com/docview/3253896753
Volume 12