Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System

Detailed bibliography
Published in: IEEE Transactions on Parallel and Distributed Systems, Volume 34, Issue 7, pp. 1-15
Main authors: Fresa, Andrea; Champati, Jaya Prakash
Format: Journal Article
Language: English
Publication details: New York: IEEE, 01.07.2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subject:
ISSN: 1045-9219, 1558-2183
Abstract With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) has received significant attention in the past. Motivated by the fact that an increasing number of applications use Machine Learning (ML) inference on data samples collected at the EDs, we study the problem of offloading inference jobs by considering the following novel aspects: 1) in contrast to a typical computational job, the processing time of an inference job depends on the size of the ML model, and 2) recently proposed Deep Neural Networks (DNNs) for resource-constrained devices provide the choice of scaling down the model size by trading off the inference accuracy. Considering that multiple ML models are available at the ED and a powerful ML model is available at the ES, we formulate an Integer Linear Programming (ILP) problem with the objective of maximizing the total inference accuracy of n data samples at the ED, subject to a time constraint T on the makespan. Noting that the problem is NP-hard, we propose an approximation algorithm, Accuracy Maximization using LP-Relaxation and Rounding (AMR^2), and prove that it results in a makespan of at most 2T and achieves a total accuracy that is lower than the optimal total accuracy by a small constant, implying that AMR^2 is asymptotically optimal. Further, if the data samples are identical, we propose Accuracy Maximization using Dynamic Programming (AMDP), an optimal pseudo-polynomial-time algorithm. Furthermore, we extend AMR^2 to the case of multiple ESs, where each ES is equipped with a powerful ML model. As a proof of concept, we implemented AMR^2 on a Raspberry Pi equipped with MobileNets, connected to a server equipped with ResNet, and studied the total accuracy and makespan performance of AMR^2 for image classification.
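The abstract does not reproduce the ILP itself, so the following is only a plausible sketch of the kind of formulation it describes, under simplifying assumptions: a single ES, one offload option per sample, and separate time budgets of T on the ED and on the ES standing in for the paper's exact makespan constraint. All symbols (a_k, tau_k, beta_i, gamma_i, A) are hypothetical placeholders, not the paper's notation.

% Plausible sketch of the accuracy-maximization ILP (assumed notation, not the paper's).
\begin{align*}
\max_{x,\,y}\quad & \sum_{i=1}^{n}\Big(\sum_{k=1}^{m} a_k\, x_{ik} + A\, y_i\Big) && \text{total inference accuracy}\\
\text{s.t.}\quad & \sum_{k=1}^{m} x_{ik} + y_i = 1, && i=1,\dots,n \ \text{(each sample uses exactly one model)}\\
& \sum_{i=1}^{n}\Big(\sum_{k=1}^{m} \tau_k\, x_{ik} + \beta_i\, y_i\Big) \le T, && \text{time spent on the ED (local inference and transmissions)}\\
& \sum_{i=1}^{n} \gamma_i\, y_i \le T, && \text{time spent on the ES}\\
& x_{ik},\, y_i \in \{0,1\}, && \text{for all } i, k,
\end{align*}
where x_{ik} = 1 assigns sample i to local model k (accuracy a_k, processing time tau_k) and y_i = 1 offloads sample i to the ES (accuracy A, transmission time beta_i, server processing time gamma_i).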
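In the same spirit, the sketch below illustrates the general LP-relaxation-and-rounding idea behind an AMR^2-style algorithm: drop the integrality constraints, solve the resulting LP with an off-the-shelf solver, and round the fractional assignments back to a schedule. It is not the authors' implementation; the rounding shown is a naive per-sample argmax (the paper's rounding is what yields the at-most-2T makespan bound), and all accuracies, processing times, and the budget T are hypothetical placeholders.

# Minimal, illustrative sketch (not the authors' AMR^2 code): solve the LP
# relaxation of the accuracy-maximization ILP and round each sample's
# fractional assignment to its largest component. All data are placeholders.
import numpy as np
from scipy.optimize import linprog

n = 6                        # number of data samples
acc  = [0.55, 0.70, 0.91]    # options: small local model, large local model, offload to the ES
t_ed = [0.02, 0.08, 0.03]    # time charged to the ED per sample (local compute or transmission), s
t_es = [0.00, 0.00, 0.05]    # time charged to the ES per sample (only the offload option uses it), s
T = 0.25                     # time budget
m = len(acc)

# Variables x[i, j] flattened row-major; maximizing accuracy == minimizing -accuracy.
c = -np.tile(acc, n)

# Each sample picks exactly one option.
A_eq = np.zeros((n, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0
b_eq = np.ones(n)

# Total ED time <= T and total ES time <= T (a simplified stand-in for the makespan constraint).
A_ub = np.vstack([np.tile(t_ed, n), np.tile(t_es, n)])
b_ub = np.array([T, T])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * (n * m), method="highs")
x = res.x.reshape(n, m)

# Naive rounding: each sample takes its largest fractional option.
choice = x.argmax(axis=1)
total_acc = sum(acc[j] for j in choice)
ed_time = sum(t_ed[j] for j in choice)
es_time = sum(t_es[j] for j in choice)
print(f"choices={choice.tolist()}  accuracy={total_acc:.2f}  "
      f"ED time={ed_time:.2f}s  ES time={es_time:.2f}s  (budget T={T})")

Because this naive rounding can overshoot the time budgets, a practical variant would re-check ed_time and es_time against T and demote samples to cheaper local models when a budget is exceeded.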
Author Champati, Jaya Prakash
Fresa, Andrea
Author_xml – sequence: 1
  givenname: Andrea
  surname: Fresa
  fullname: Fresa, Andrea
  organization: Edge Networks Group, IMDEA Networks Institute, Madrid, Spain
– sequence: 2
  givenname: Jaya Prakash
  orcidid: 0000-0002-5127-8497
  surname: Champati
  fullname: Champati, Jaya Prakash
  organization: Edge Networks Group, IMDEA Networks Institute, Madrid, Spain
CODEN ITDSEO
CitedBy_id crossref_primary_10_1109_ACCESS_2024_3421652
crossref_primary_10_1186_s13677_024_00693_x
crossref_primary_10_1109_JIOT_2025_3583477
crossref_primary_10_1109_TMC_2024_3466844
crossref_primary_10_1109_TWC_2024_3497593
crossref_primary_10_1109_OJCOMS_2025_3555947
crossref_primary_10_1109_ACCESS_2024_3469956
crossref_primary_10_1016_j_cosrev_2024_100656
Cites_doi 10.1109/ISIT.2016.7541539
10.1016/0167-6377(89)90063-1
10.1109/CVPR.2018.00474
10.1109/LCOMM.2020.3034992
10.1145/3551659.3559044
10.1109/TMC.2020.3004225
10.1109/ICPR.2016.7900006
10.1007/BF01585178
10.1109/TMC.2019.2944371
10.1109/TCOMM.2016.2599530
10.1109/TPDS.2020.3032443
10.1109/IC2E48712.2020.00010
10.1109/ICC.2015.7249203
10.1109/TPDS.2022.3222509
10.1016/0377-2217(92)90077-M
10.1109/CVPR.2009.5206848
10.26599/TST.2021.9010050
10.1109/SPAWC.2015.7227025
10.1007/BF01580430
10.1109/MPRV.2009.82
10.1016/0166-218X(85)90009-5
10.1109/TPDS.2016.2605684
10.1109/JSAC.2016.2611964
10.1109/COMST.2017.2682318
10.1007/978-3-540-24777-7
10.1109/TPDS.2019.2929173
10.1109/JIOT.2016.2579198
10.1109/JPROC.2020.2976475
10.1109/CVPR.2016.90
10.1109/BigData47090.2019.9005455
10.1137/1.9781611975994.16
10.1145/3345768.3355917
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023
DBID 97E
RIA
RIE
AAYXX
CITATION
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
DOI 10.1109/TPDS.2023.3267458
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005-present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE Electronic Library (IEL)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
DatabaseTitle CrossRef
Technology Research Database
Computer and Information Systems Abstracts – Academic
Electronics & Communications Abstracts
ProQuest Computer Science Collection
Computer and Information Systems Abstracts
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts Professional
DatabaseTitleList Technology Research Database

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 15
ExternalDocumentID 10_1109_TPDS_2023_3267458
10105465
Genre orig-research
GroupedDBID --Z
-~X
.DC
0R~
29I
4.4
5GY
6IK
97E
AAJGR
AARMG
AASAJ
AAWTH
ABAZT
ABQJQ
ABVLG
ACGFO
ACIWK
AENEX
AGQYO
AHBIQ
AKJIK
AKQYR
ALMA_UNASSIGNED_HOLDINGS
ASUFR
ATWAV
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CS3
DU5
EBS
EJD
HZ~
IEDLZ
IFIPE
IPLJI
JAVBF
LAI
M43
MS~
O9-
OCL
P2P
PQQKQ
RIA
RIE
RNS
TN5
TWZ
UHB
5VS
AAYXX
ABFSI
AETIX
AGSQL
AI.
AIBXA
ALLEH
CITATION
E.L
H~9
ICLAB
IFJZH
RNI
RZB
VH1
7SC
7SP
8FD
JQ2
L7M
L~C
L~D
ID FETCH-LOGICAL-c294t-a2a0bfe822e5cfad9faf22eda24c8001f1033d8e444ff3191a5787dfcf7317cf3
IEDL.DBID RIE
ISICitedReferencesCount 12
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000994532500003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 1045-9219
IngestDate Sun Nov 09 05:48:52 EST 2025
Sat Nov 29 06:06:50 EST 2025
Tue Nov 18 21:32:39 EST 2025
Wed Aug 27 02:21:20 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 7
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-c294t-a2a0bfe822e5cfad9faf22eda24c8001f1033d8e444ff3191a5787dfcf7317cf3
Notes ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ORCID 0000-0002-5127-8497
0000-0002-2849-5151
PQID 2818400124
PQPubID 85437
PageCount 15
ParticipantIDs crossref_primary_10_1109_TPDS_2023_3267458
crossref_citationtrail_10_1109_TPDS_2023_3267458
ieee_primary_10105465
proquest_journals_2818400124
PublicationCentury 2000
PublicationDate 2023-07-01
PublicationDateYYYYMMDD 2023-07-01
PublicationDate_xml – month: 07
  year: 2023
  text: 2023-07-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on parallel and distributed systems
PublicationTitleAbbrev TPDS
PublicationYear 2023
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References ref13
ref35
ref12
ref15
ref37
ref14
ref36
ref31
ref30
ref33
ref32
Cai (ref10) 2019
ref2
ref1
ref17
ref16
ref19
ref18
ref24
ref23
ref26
ref25
ref20
ref22
ref21
Howard (ref38) 2017
ref28
ref27
Chollet (ref39) 2015
ref29
ref8
ref7
ref9
ref3
ref6
Chekuri (ref34)
Pinedo (ref11) 2008
References_xml – ident: ref17
  doi: 10.1109/ISIT.2016.7541539
– ident: ref32
  doi: 10.1016/0167-6377(89)90063-1
– ident: ref8
  doi: 10.1109/CVPR.2018.00474
– ident: ref28
  doi: 10.1109/LCOMM.2020.3034992
– ident: ref15
  doi: 10.1145/3551659.3559044
– ident: ref25
  doi: 10.1109/TMC.2020.3004225
– ident: ref9
  doi: 10.1109/ICPR.2016.7900006
– ident: ref14
  doi: 10.1007/BF01585178
– ident: ref30
  doi: 10.1109/TMC.2019.2944371
– ident: ref23
  doi: 10.1109/TCOMM.2016.2599530
– ident: ref29
  doi: 10.1109/TPDS.2020.3032443
– start-page: 213
  volume-title: Proc. 11th Annu. ACM-SIAM Symp. Discrete Algorithms
  ident: ref34
  article-title: A PTAS for the multiple knapsack problem
– ident: ref27
  doi: 10.1109/IC2E48712.2020.00010
– ident: ref22
  doi: 10.1109/ICC.2015.7249203
– ident: ref31
  doi: 10.1109/TPDS.2022.3222509
– ident: ref33
  doi: 10.1016/0377-2217(92)90077-M
– ident: ref7
  doi: 10.1109/CVPR.2009.5206848
– ident: ref24
  doi: 10.26599/TST.2021.9010050
– ident: ref21
  doi: 10.1109/SPAWC.2015.7227025
– ident: ref13
  doi: 10.1007/BF01580430
– ident: ref16
  doi: 10.1109/MPRV.2009.82
– year: 2019
  ident: ref10
  article-title: Once for all: Train one network and specialize it for efficient deployment
– ident: ref36
  doi: 10.1016/0166-218X(85)90009-5
– year: 2015
  ident: ref39
  article-title: Keras
– ident: ref19
  doi: 10.1109/TPDS.2016.2605684
– ident: ref18
  doi: 10.1109/JSAC.2016.2611964
– ident: ref2
  doi: 10.1109/COMST.2017.2682318
– ident: ref12
  doi: 10.1007/978-3-540-24777-7
– ident: ref20
  doi: 10.1109/TPDS.2019.2929173
– year: 2017
  ident: ref38
  article-title: MobileNets: Efficient convolutional neural networks for mobile vision applications
– volume-title: Scheduling: Theory, Algorithms, and Systems
  year: 2008
  ident: ref11
– ident: ref1
  doi: 10.1109/JIOT.2016.2579198
– ident: ref3
  doi: 10.1109/JPROC.2020.2976475
– ident: ref6
  doi: 10.1109/CVPR.2016.90
– ident: ref35
  doi: 10.1109/BigData47090.2019.9005455
– ident: ref37
  doi: 10.1137/1.9781611975994.16
– ident: ref26
  doi: 10.1145/3345768.3355917
SSID ssj0014504
Score 2.4576225
Snippet With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) received significant attention in the...
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 1
SubjectTerms Accuracy
Algorithms
Approximation algorithms
Artificial neural networks
Computational modeling
Constraints
Costs
Data models
Dynamic programming
Edge computing
Image classification
Inference
Inference algorithms
Integer programming
Linear programming
Machine learning
Maximization
Optimization
Polynomials
Scheduling
Servers
Title Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System
URI https://ieeexplore.ieee.org/document/10105465
https://www.proquest.com/docview/2818400124
Volume 34
WOSCitedRecordID wos000994532500003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1558-2183
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014504
  issn: 1045-9219
  databaseCode: RIE
  dateStart: 19900101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1bS8MwFD644YM-OJ2K80YefBI62y5dm8fhBX3wAk7YW0mTk1nYWtlF1F9vkmZjIAq-hTYppV9P8iXn8gGcacaQSI7cC3jH98zs5yWZZJ6f6LUJKYZCCCs2ET88JIMBe3LJ6jYXBhFt8Bm2TdP68mUp5uaoTFu4ZgO0G9WgFsdxlay1dBnQyGoF6u1F5DFth86FGfjsov909dw2OuFtTVZiauTdVxYhq6ryYyq268tN459vtg1bjkiSXoX8Dqxh0YTGQqSBOJttwuZKxcFdwEelRqWNmye90bCc5LPX8ZRo5kru-Uc-zr_MnbtFFiDpCTGfcPFJyoJcyyGSKzRTC8kLwt2Vu5WinqQqgL4HLzfX_ctbzykteCJkdObxkPuZQk0WMBKKS6a40m3JQyo0owxU4Hc6MkFKqVLaaANuDF0qoWLNP4Tq7EO9KAs8AKIZGfIgUyqSkmouwJnhkHpjglk3C2PWAn_x6VPhypAbNYxRarcjPksNWqlBK3VoteB8OeStqsHxV-c9A89KxwqZFhwvAE6dmU5TUwqLGsZHD38ZdgQb5ulVgO4x1GeTOZ7Aunif5dPJqf0DvwElJ9iB
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1bS8MwFD7oFNQH7-K85sEnobPtUtc8Dqc4nFNwgm8lTU60MFvZRdRfb5JmYyAKvoU2oaVfT_Il5_IBnGjGEEuO3At43ffM7OfFqWSeH-u1CSmGQggrNtHoduOnJ3bvktVtLgwi2uAzrJmm9eXLQozNUZm2cM0G6Hk0DwsRpWFQpmtNnQY0smqBeoMReUxbonNiBj476923HmpGKbym6UqDGoH3mWXI6qr8mIztCnO19s93W4dVRyVJs8R-A-Yw34S1iUwDcVa7CSszNQe3AO-U6hc2cp40-8_FIBu9vA6J5q7kln9kr9mXudOe5AGSphDjARefpMjJpXxG0kIzuZAsJ9xdac-U9SRlCfRteLy67F1ce05rwRMhoyOPh9xPFWq6gJFQXDLFlW5LHlKhOWWgAr9elzFSSpXSZhtwY-pSCdXQDESo-g5U8iLHXSCakyEPUqUiKalmA5wZFqm3Jpiep2GDVcGffPpEuELkRg-jn9gNic8Sg1Zi0EocWlU4nQ55K6tw_NV528Az07FEpgoHE4ATZ6jDxBTDoobz0b1fhh3D0nXvtpN02t2bfVg2TyrDdQ-gMhqM8RAWxfsoGw6O7N_4DY9q28g
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Offloading+Algorithms+for+Maximizing+Inference+Accuracy+on+Edge+Device+in+an+Edge+Intelligence+System&rft.jtitle=IEEE+transactions+on+parallel+and+distributed+systems&rft.au=Fresa%2C+Andrea&rft.au=Champati%2C+Jaya+Prakash&rft.date=2023-07-01&rft.issn=1045-9219&rft.eissn=1558-2183&rft.volume=34&rft.issue=7&rft.spage=2025&rft.epage=2039&rft_id=info:doi/10.1109%2FTPDS.2023.3267458&rft.externalDBID=n%2Fa&rft.externalDocID=10_1109_TPDS_2023_3267458
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1045-9219&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1045-9219&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1045-9219&client=summon