Performance Analysis and Characterization of Training Deep Learning Models on Mobile Device

Training deep learning models on mobile devices has recently become possible because of the increasing computation power of mobile hardware and the advantages of enhancing user experience. Most existing work on machine learning on mobile devices focuses on the inference of deep learning models,...

Bibliographic Details
Published in: 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS), pp. 506-515
Main Authors: Liu, Jie; Liu, Jiawen; Du, Wan; Li, Dong
Format: Conference Paper
Language: English
Published: IEEE, 1 December 2019
Abstract Training deep learning models on mobile devices has recently become possible because of the increasing computation power of mobile hardware and the advantages of enhancing user experience. Most existing work on machine learning on mobile devices focuses on the inference of deep learning models, not on training. The performance characterization of training deep learning models on mobile devices is largely unexplored, although understanding it is critical for designing and implementing deep learning models on mobile devices. In this paper, we perform a variety of experiments on a representative mobile device (the NVIDIA TX2) to study the performance of training deep learning models. We introduce a benchmark suite and a tool to study the performance of training deep learning models on mobile devices from the perspectives of memory consumption, hardware utilization, and power consumption. The tool can correlate performance results with fine-grained operations in deep learning models, making it possible to capture performance variance and problems at a fine granularity. We reveal interesting performance problems and opportunities, including under-utilization of heterogeneous hardware, large energy consumption of the memory, and high predictability of workload characterization. Based on the performance analysis, we suggest interesting research directions.
Author Liu, Jie
Liu, Jiawen
Du, Wan
Li, Dong
Author_xml – sequence: 1
  givenname: Jie
  surname: Liu
  fullname: Liu, Jie
  organization: University of California, Merced
– sequence: 2
  givenname: Jiawen
  surname: Liu
  fullname: Liu, Jiawen
  organization: University of California, Merced
– sequence: 3
  givenname: Wan
  surname: Du
  fullname: Du, Wan
  organization: University of California, Merced
– sequence: 4
  givenname: Dong
  surname: Li
  fullname: Li, Dong
  organization: University of California, Merced
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/ICPADS47876.2019.00077
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Xplore POP ALL
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9781728125831
1728125839
EndPage 515
ExternalDocumentID 8975795
Genre orig-research
GroupedDBID 6IE
6IL
CBEJK
RIE
RIL
ID FETCH-LOGICAL-i203t-e90542a8ed0353f9a3c4ec4097a703e70863eba3fa3c0799c321de5ef2b7eda33
IEDL.DBID RIE
ISICitedReferencesCount 40
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000530854900068&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Thu Jun 29 18:38:56 EDT 2023
IsPeerReviewed false
IsScholarly true
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i203t-e90542a8ed0353f9a3c4ec4097a703e70863eba3fa3c0799c321de5ef2b7eda33
PageCount 10
ParticipantIDs ieee_primary_8975795
PublicationCentury 2000
PublicationDate 2019-Dec
PublicationDateYYYYMMDD 2019-12-01
PublicationDate_xml – month: 12
  year: 2019
  text: 2019-Dec
PublicationDecade 2010
PublicationTitle 2019 IEEE 25th International Conference on Parallel and Distributed Systems (ICPADS)
PublicationTitleAbbrev PADSW
PublicationYear 2019
Publisher IEEE
Publisher_xml – name: IEEE
Score 2.300947
Snippet Training deep learning models on mobile devices has recently become possible because of the increasing computation power of mobile hardware and the advantages of...
SourceID ieee
SourceType Publisher
StartPage 506
SubjectTerms deep learning
hardware heterogeneity
mobile device
performance analysis
performance characterization
Title Performance Analysis and Characterization of Training Deep Learning Models on Mobile Device
URI https://ieeexplore.ieee.org/document/8975795
WOSCitedRecordID wos000530854900068&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint