Multiendpoint DAG-Driven Joint Partitioning-Offloading and Scheduling Optimization for DNN Inference
Model partitioning techniques, which decompose and collaboratively execute subtasks of deep neural networks (DNNs), have emerged as a critical strategy for enhancing distributed inference efficiency. However, in mobile edge computing (MEC), dynamic load fluctuations at edge nodes and the complexity...
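The two steps named in the abstract, simulating queuing delays to obtain a latency objective and then searching for a partitioning-offloading assignment, can be illustrated with a toy sketch. This is not the authors' algorithm: the layer graph, node speeds, busy times, fixed transfer delay, and the exhaustive search are all hypothetical assumptions chosen only to show how a simulated schedule can drive the choice of assignment.

```python
from itertools import product

# Hypothetical toy DNN: layer -> (compute cost in work units, predecessor layers).
# Keys are listed in topological order.
LAYERS = {
    "conv1": (4.0, []),
    "branch_a": (6.0, ["conv1"]),
    "branch_b": (6.0, ["conv1"]),
    "concat": (2.0, ["branch_a", "branch_b"]),
}

# Hypothetical nodes: name -> (speed in work units per second, time the node becomes free).
NODES = {
    "device": (1.0, 0.0),
    "edge1": (4.0, 1.0),
    "edge2": (4.0, 0.0),
}

TRANSFER_DELAY = 0.5  # assumed fixed cost (seconds) to move a tensor between different nodes


def simulate_latency(assignment):
    """Simulate a FIFO schedule and return the makespan of one layer-to-node assignment."""
    node_free = {name: free_at for name, (_, free_at) in NODES.items()}
    finish = {}
    for layer, (cost, preds) in LAYERS.items():
        node = assignment[layer]
        # A layer can start once every input has arrived on its node.
        ready = max(
            (finish[p] + (TRANSFER_DELAY if assignment[p] != node else 0.0) for p in preds),
            default=0.0,
        )
        start = max(ready, node_free[node])  # queuing delay if the node is still busy
        finish[layer] = start + cost / NODES[node][0]
        node_free[node] = finish[layer]
    return max(finish.values())


def best_assignment():
    """Exhaustively search all assignments; only feasible for toy-sized graphs."""
    layers = list(LAYERS)
    candidates = (dict(zip(layers, combo)) for combo in product(NODES, repeat=len(layers)))
    return min(candidates, key=simulate_latency)
```

Calling `simulate_latency(best_assignment())` gives the lowest simulated latency for this toy setup. A real system would replace the exhaustive search with a scalable retrieval method and the fixed transfer delay with measured link costs.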
| Published in: | IEEE Internet of Things Journal, Vol. 12, No. 19, pp. 41087-41102 |
|---|---|
| Main authors: | Yan, Xiukun; Zhang, Xuexue; Zeng, Kai; Bai, Fenhua; Shen, Tao; Cao, Bin |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway, IEEE, 1 October 2025, The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 2327-4662 |
| Online access: | Full text |
| Abstract | Model partitioning techniques, which decompose and collaboratively execute subtasks of deep neural networks (DNNs), have emerged as a critical strategy for enhancing distributed inference efficiency. However, in mobile edge computing (MEC), dynamic load fluctuations at edge nodes and the complexity of cross-node task dependencies make the delay minimization problem extremely challenging. Existing studies predominantly adopt a decoupled optimization framework that separately addresses partitioning-offloading and pipeline scheduling, neglecting their inherent cyclic state-dependent coupling. This oversight leads to suboptimal solutions, such as pipeline stagnation caused by mismatched computation and communication timestamps. To address these challenges, we propose a multiendpoint directed acyclic graph (DAG)-driven cooperative optimization approach, enabling partitioning-offloading and pipeline scheduling in MEC. Specifically, the approach involves two core steps: 1) Dynamic prescheduling: We propose an improved DNN scheduling algorithm for constrained subtasks, which simulates node-level queuing delays and pipeline stalls under real-world constraints, translating runtime states into latency objectives. 2) Partitioning and offloading solution retrieval: Based on latency objectives, we introduce a novel multiendpoint DAG structure and design a multinode collaborative optimization retrieval algorithm, enabling adaptive partitioning-offloading remapping of subtasks. Experiments demonstrate the superiority of the proposed method over other advanced methods, reducing the time overhead by an average of 24% and 75% in two different scenarios, respectively. The source code is available at: https://github.com/aiheiheiheii/Partition_Scheduling.git. |
|---|---|
| Author | Zhang, Xuexue Shen, Tao Zeng, Kai Cao, Bin Yan, Xiukun Bai, Fenhua |
| Author_xml | – sequence: 1 givenname: Xiukun orcidid: 0009-0008-7691-3427 surname: Yan fullname: Yan, Xiukun email: xk_yan@stu.kust.edu.cn organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China – sequence: 2 givenname: Xuexue orcidid: 0009-0002-9524-6216 surname: Zhang fullname: Zhang, Xuexue email: zhangxuexue@stu.kust.edu.cn organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China – sequence: 3 givenname: Kai orcidid: 0000-0003-2662-1596 surname: Zeng fullname: Zeng, Kai email: zengkai@kust.edu.cn organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China – sequence: 4 givenname: Fenhua orcidid: 0000-0002-2505-0288 surname: Bai fullname: Bai, Fenhua email: baifenhua@kust.edu.cn organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China – sequence: 5 givenname: Tao orcidid: 0000-0003-1273-7950 surname: Shen fullname: Shen, Tao email: shentao@kust.edu.cn organization: Faculty of Information Engineering and Automation, and the Yunnan Key Laboratory of Computer Technologies Application, Kunming University of Science and Technology, Kunming, China – sequence: 6 givenname: Bin orcidid: 0000-0001-8839-243X surname: Cao fullname: Cao, Bin email: caobin@bupt.edu.cn organization: State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing, China |
| CODEN | IITJAU |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2025 |
| DBID | 97E RIA RIE AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/JIOT.2025.3591531 |
| DatabaseName | IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Xplore url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 2327-4662 |
| EndPage | 41102 |
| ExternalDocumentID | 10_1109_JIOT_2025_3591531 11089940 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: Major Project on Special Education for Linguistic Intelligence in the Key Laboratory of Philosophy and Social Science of Sichuan Province grantid: YYZN-2024-1 – fundername: Photonics Fund Class A grantid: ghfund202407010460 – fundername: Yunnan Fundamental Research Projects grantid: 202301AV070003; 202501AT070345 – fundername: National Natural Science Foundation of China grantid: 62471205 funderid: 10.13039/501100001809 – fundername: Major Science and Technology Projects in Yunnan Province grantid: 202302AG050009; 202202AD080013 funderid: 10.13039/501100018531 |
| ISICitedReferencesCount | 1 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001579082500032&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 2327-4662 |
| IngestDate | Thu Nov 20 16:00:38 EST 2025 Sat Nov 29 07:23:09 EST 2025 Wed Oct 01 07:05:11 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Issue | 19 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ORCID | 0009-0008-7691-3427 0009-0002-9524-6216 0000-0002-2505-0288 0000-0001-8839-243X 0000-0003-2662-1596 0000-0003-1273-7950 |
| PQID | 3253896753 |
| PQPubID | 2040421 |
| PageCount | 16 |
| ParticipantIDs | ieee_primary_11089940 proquest_journals_3253896753 crossref_primary_10_1109_JIOT_2025_3591531 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-10-01 |
| PublicationDateYYYYMMDD | 2025-10-01 |
| PublicationDate_xml | – month: 10 year: 2025 text: 2025-10-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | Piscataway |
| PublicationPlace_xml | – name: Piscataway |
| PublicationTitle | IEEE internet of things journal |
| PublicationTitleAbbrev | JIoT |
| PublicationYear | 2025 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Snippet | Model partitioning techniques, which decompose and collaboratively execute subtasks of deep neural networks (DNNs), have emerged as a critical strategy for... |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Index Database Publisher |
| StartPage | 41087 |
| SubjectTerms | Adaptive algorithms Artificial neural networks Chemical partition Collaboration Computational modeling Constrained scheduling deep neural network (DNN) inference Delays Design optimization distributed computing Dynamic loads Dynamic scheduling Edge computing Inference Internet of Things Load fluctuation Mobile computing model partitioning multiendpoint directed acyclic graph (DAG) Optimization Partitioning Pipelines Queueing Resource management Retrieval Scheduling Servers |
| Title | Multiendpoint DAG-Driven Joint Partitioning-Offloading and Scheduling Optimization for DNN Inference |
| URI | https://ieeexplore.ieee.org/document/11089940 https://www.proquest.com/docview/3253896753 |
| Volume | 12 |
| WOSCitedRecordID | wos001579082500032&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |