MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference

Heterogeneous embedded systems have surfaced as a promising solution for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior works, it still remains unexplored to investigate the system-software support that efficiently executes inference workloads by judiciousl...

Full description

Bibliographic details
Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 165-177
Main authors: Han, Myeonggyun; Hyun, Jihoon; Park, Seongbeom; Park, Jinsu; Baek, Woongki
Format: Conference proceeding
Language: English
Published: IEEE, 1 September 2019
Subjects:
ISSN: 2641-7936
Online access: Full text
Abstract Heterogeneous embedded systems have surfaced as a promising solution for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior works, it still remains unexplored to investigate the system-software support that efficiently executes inference workloads by judiciously considering their performance and energy heterogeneity, communication overheads, and constraints. To bridge this gap, we propose MOSAIC, heterogeneity-, communication-, and constraint-aware model slicing and execution for accurate and efficient inference on heterogeneous embedded systems. MOSAIC generates the efficient model slicing and execution plan for the target inference workload through dynamic programming. MOSAIC significantly reduces inference latency and energy, exhibits high estimation accuracy, and incurs small overheads.
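The abstract states only that the slicing and execution plan is produced through dynamic programming; it does not give the formulation. As a rough illustration of the general idea, and not the authors' algorithm, the sketch below assigns each layer of a linear model to one device so that the sum of per-layer latency and inter-device transfer cost is minimal. The function name `plan_slices`, the cost tables, and the device names are all hypothetical, and constraints and energy are left out.

```python
from typing import Dict, List, Tuple


def plan_slices(
    layer_cost: List[Dict[str, float]],
    transfer_cost: List[float],
) -> Tuple[float, List[str]]:
    """Return (total latency, device chosen per layer) for a chain of layers.

    layer_cost[i][d] : assumed latency of layer i on device d
    transfer_cost[i] : assumed cost of moving layer i's output to another device
    """
    devices = list(layer_cost[0])
    # best[d] = cheapest latency for layers 0..i when layer i runs on device d
    best = {d: layer_cost[0][d] for d in devices}
    back: List[Dict[str, str]] = []  # back[i-1][d] = device chosen for layer i-1

    for i in range(1, len(layer_cost)):
        hop = transfer_cost[i - 1]
        new_best: Dict[str, float] = {}
        choice: Dict[str, str] = {}
        for d in devices:
            # A transfer is paid only when consecutive layers change device.
            prev = min(devices, key=lambda p: best[p] + (0.0 if p == d else hop))
            new_best[d] = best[prev] + (0.0 if prev == d else hop) + layer_cost[i][d]
            choice[d] = prev
        best = new_best
        back.append(choice)

    # Walk the back-pointers from the cheapest final device to recover the plan.
    last = min(devices, key=lambda d: best[d])
    total = best[last]
    plan = [last]
    for choice in reversed(back):
        plan.append(choice[plan[-1]])
    plan.reverse()
    return total, plan


if __name__ == "__main__":
    costs = [
        {"cpu": 4.0, "gpu": 1.0},
        {"cpu": 1.0, "gpu": 5.0},
        {"cpu": 3.0, "gpu": 1.0},
    ]
    print(plan_slices(costs, [0.5, 0.5]))
```

With these made-up numbers the cheapest plan switches devices twice, because the transfer cost is smaller than the penalty of running the middle layer on the slower device. A higher transfer cost would push the plan toward a single device.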
Author Park, Seongbeom
Baek, Woongki
Hyun, Jihoon
Han, Myeonggyun
Park, Jinsu
Author_xml – sequence: 1
  givenname: Myeonggyun
  surname: Han
  fullname: Han, Myeonggyun
  organization: UNIST
– sequence: 2
  givenname: Jihoon
  surname: Hyun
  fullname: Hyun, Jihoon
  organization: UNIST
– sequence: 3
  givenname: Seongbeom
  surname: Park
  fullname: Park, Seongbeom
  organization: UNIST
– sequence: 4
  givenname: Jinsu
  surname: Park
  fullname: Park, Jinsu
  organization: UNIST
– sequence: 5
  givenname: Woongki
  surname: Baek
  fullname: Baek, Woongki
  organization: UNIST
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/PACT.2019.00021
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 172813613X
9781728136134
EISSN 2641-7936
EndPage 177
ExternalDocumentID 8891642
Genre orig-research
GroupedDBID 123
23M
29O
6IE
6IL
ACGFS
AFFNX
ALMA_UNASSIGNED_HOLDINGS
CBEJK
M43
RIE
RIL
RNS
IEDL.DBID RIE
ISICitedReferencesCount 30
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000550990200013&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:43:19 EDT 2025
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
PageCount 13
ParticipantIDs ieee_primary_8891642
PublicationCentury 2000
PublicationDate 2019-Sept.
PublicationDateYYYYMMDD 2019-09-01
PublicationDate_xml – month: 09
  year: 2019
  text: 2019-Sept.
PublicationDecade 2010
PublicationTitle Proceedings / International Conference on Parallel Architectures and Compilation Techniques
PublicationTitleAbbrev PACT
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0041653
ssib057737306
Score 2.2549703
SourceID ieee
SourceType Publisher
StartPage 165
SubjectTerms Computational modeling
Embedded systems
Energy consumption
Graphics processing units
Heterogeneous Embedded Systems
Inference
Memory management
Model Slicing and Execution
Performance evaluation
Title MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference
URI https://ieeexplore.ieee.org/document/8891642
WOSCitedRecordID wos000550990200013&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1