Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System
| Published in: | IEEE transactions on parallel and distributed systems, Volume 34; Issue 7; pp. 1 - 15 |
|---|---|
| Main authors: | Fresa, Andrea; Champati, Jaya Prakash |
| Format: | Journal Article |
| Language: | English |
| Publication details: | New York: IEEE, 01.07.2023; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subject: | |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | With the emergence of edge computing, the problem of offloading jobs between an Edge Device (ED) and an Edge Server (ES) has received significant attention. Motivated by the fact that an increasing number of applications use Machine Learning (ML) inference on the data samples collected at the EDs, we study the problem of offloading inference jobs by considering the following novel aspects: 1) in contrast to a typical computational job, the processing time of an inference job depends on the size of the ML model, and 2) recently proposed Deep Neural Networks (DNNs) for resource-constrained devices provide the choice of scaling down the model size by trading off the inference accuracy. Considering that multiple ML models are available at the ED, and a powerful ML model is available at the ES, we formulate an Integer Linear Programming (ILP) problem with the objective of maximizing the total inference accuracy of $n$ data samples at the ED subject to a time constraint $T$ on the makespan. Noting that the problem is NP-hard, we propose an approximation algorithm, Accuracy Maximization using LP-Relaxation and Rounding (AMR$^{2}$), and prove that it results in a makespan of at most $2T$ and achieves a total accuracy that is lower than the optimal total accuracy by a small constant, implying that AMR$^{2}$ is asymptotically optimal. Further, if the data samples are identical, we propose Accuracy Maximization using Dynamic Programming (AMDP), an optimal pseudo-polynomial time algorithm. Furthermore, we extend AMR$^{2}$ to the case of multiple ESs, where each ES is equipped with a powerful ML model. As proof of concept, we implemented AMR$^{2}$ on a Raspberry Pi, equipped with MobileNets, that is connected to a server equipped with ResNet, and studied the total accuracy and makespan performance of AMR$^{2}$ for image classification. |
|---|---|
| Author | Champati, Jaya Prakash; Fresa, Andrea |
| Author_xml | 1. Fresa, Andrea (Edge Networks Group, IMDEA Networks Institute, Madrid, Spain); 2. Champati, Jaya Prakash (ORCID: 0000-0002-5127-8497; Edge Networks Group, IMDEA Networks Institute, Madrid, Spain) |
| CODEN | ITDSEO |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
| DOI | 10.1109/TPDS.2023.3267458 |
| Discipline | Engineering; Computer Science |
| EISSN | 1558-2183 |
| EndPage | 15 |
| Genre | orig-research |
| ISICitedReferencesCount | 12 |
| ISSN | 1045-9219 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 7 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0002-5127-8497 0000-0002-2849-5151 |
| PageCount | 15 |
| PublicationDate | 2023-07-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2023 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1 |
| SubjectTerms | Accuracy; Algorithms; Approximation algorithms; Artificial neural networks; Computational modeling; Constraints; Costs; Data models; Dynamic programming; Edge computing; Image classification; Inference; Inference algorithms; Integer programming; Linear programming; Machine learning; Maximization; Optimization; Polynomials; Scheduling; Servers |
| Title | Offloading Algorithms for Maximizing Inference Accuracy on Edge Device in an Edge Intelligence System |
| URI | https://ieeexplore.ieee.org/document/10105465 https://www.proquest.com/docview/2818400124 |
| Volume | 34 |
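
The abstract states that, when the data samples are identical, the accuracy-maximization problem admits an optimal pseudo-polynomial dynamic program. The sketch below illustrates that general idea only; it is not the authors' AMDP algorithm, whose details are not given in this record. It assumes integer processing times, that the ED and the ES each process their samples one after another while running in parallel with each other, and that the ES time per sample already includes communication. The function name and all parameter names are hypothetical.

```python
from typing import List, Optional

NEG_INF = float("-inf")


def max_total_accuracy(
    n: int,                # number of identical data samples (assumed)
    T: int,                # makespan limit, in integer time units (assumed)
    ed_times: List[int],   # per-sample time of each ED model (hypothetical input)
    ed_accs: List[float],  # per-sample accuracy of each ED model (hypothetical input)
    es_time: int,          # per-sample time on the ES, incl. communication (assumed)
    es_acc: float,         # per-sample accuracy of the ES model (hypothetical input)
) -> Optional[float]:
    """Return the best total accuracy with makespan <= T, or None if infeasible."""
    # f[j][t]: best total accuracy when exactly j samples are processed
    # on the ED within a time budget of t.
    f = [[NEG_INF] * (T + 1) for _ in range(n + 1)]
    for t in range(T + 1):
        f[0][t] = 0.0  # zero samples on the ED cost nothing
    for j in range(1, n + 1):
        for t in range(T + 1):
            for p, a in zip(ed_times, ed_accs):
                # Try giving the j-th ED sample to the model with time p.
                if p <= t and f[j - 1][t - p] != NEG_INF:
                    f[j][t] = max(f[j][t], f[j - 1][t - p] + a)

    best = NEG_INF
    for k in range(n + 1):  # k samples are offloaded to the ES
        if k * es_time > T:
            break  # the ES alone would exceed the makespan limit
        if f[n - k][T] != NEG_INF:
            best = max(best, f[n - k][T] + k * es_acc)
    return None if best == NEG_INF else best
```

The table has `(n + 1) * (T + 1)` entries and each entry scans all ED models, so the running time grows with the numeric value of `T` rather than with its bit length, which is what makes such a procedure pseudo-polynomial. For example, `max_total_accuracy(3, 10, [2, 5], [0.6, 0.8], 6, 0.95)` offloads one sample and runs the larger ED model on the other two.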