MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference
Heterogeneous embedded systems have surfaced as a promising solution for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior works, it still remains unexplored to investigate the system-software support that efficiently executes inference workloads by judiciousl...
| Published in: | Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 165-177 |
|---|---|
| Main authors: | Han, Myeonggyun; Hyun, Jihoon; Park, Seongbeom; Park, Jinsu; Baek, Woongki |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 2019-09-01 |
| Subjects: | |
| ISSN: | 2641-7936 |
| Online access: | Full text |
| Abstract | Heterogeneous embedded systems have surfaced as a promising solution for accurate and efficient deep-learning inference on mobile devices. Despite extensive prior works, it still remains unexplored to investigate the system-software support that efficiently executes inference workloads by judiciously considering their performance and energy heterogeneity, communication overheads, and constraints. To bridge this gap, we propose MOSAIC, heterogeneity-, communication-, and constraint-aware model slicing and execution for accurate and efficient inference on heterogeneous embedded systems. MOSAIC generates the efficient model slicing and execution plan for the target inference workload through dynamic programming. MOSAIC significantly reduces inference latency and energy, exhibits high estimation accuracy, and incurs small overheads. |
|---|---|
| Author | Park, Seongbeom; Baek, Woongki; Hyun, Jihoon; Han, Myeonggyun; Park, Jinsu |
| Author_xml | 1. Han, Myeonggyun (UNIST); 2. Hyun, Jihoon (UNIST); 3. Park, Seongbeom (UNIST); 4. Park, Jinsu (UNIST); 5. Baek, Woongki (UNIST) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/PACT.2019.00021 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Electronic Library (IEL) IEEE Proceedings Order Plans (POP All) 1998-Present |
| Discipline | Computer Science |
| EISBN | 172813613X; 9781728136134 |
| EISSN | 2641-7936 |
| EndPage | 177 |
| ExternalDocumentID | 8891642 |
| Genre | orig-research |
| ISICitedReferencesCount | 30 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2019-Sept. |
| PublicationDateYYYYMMDD | 2019-09-01 |
| PublicationDate_xml | – month: 09 year: 2019 text: 2019-Sept. |
| PublicationDecade | 2010 |
| PublicationTitle | Proceedings / International Conference on Parallel Architectures and Compilation Techniques |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2019 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 165 |
| SubjectTerms | Computational modeling; Embedded systems; Energy consumption; Graphics processing units; Heterogeneous Embedded Systems; Inference; Memory management; Model Slicing and Execution; Performance evaluation |
| Title | MOSAIC: Heterogeneity-, Communication-, and Constraint-Aware Model Slicing and Execution for Accurate and Efficient Inference |
| URI | https://ieeexplore.ieee.org/document/8891642 |