ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems

Bibliographic Details
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 151-156
Main Authors: Huai, Shuo; Zhang, Lei; Liu, Di; Liu, Weichen; Subramaniam, Ravi
Format: Conference Proceeding
Language: English
Published: IEEE 05.12.2021
Abstract Edge devices have been widely adopted to bring deep learning applications onto low-power embedded systems, mitigating the privacy and latency issues of accessing cloud servers. The increasing computational demand of complex neural network models leads to high latency on edge devices with limited resources. Many application scenarios are real-time and have a strict latency constraint, while conventional neural network compression methods are not latency-oriented. In this work, we propose a novel compact neural network training method to reduce model latency on latency-critical edge systems. A latency predictor is also introduced to guide and optimize this procedure. Coupled with the latency predictor, our method can guarantee the latency of a compact model with only one training process. The experimental results show that, compared to state-of-the-art model compression methods, our approach can fit the 'hard' latency constraint well, significantly reducing latency with only a mild accuracy drop. To satisfy a 34 ms latency constraint, we compact ResNet-50 with a 0.82% accuracy drop. For GoogLeNet, we can even increase the accuracy by 0.3%.
Author Subramaniam, Ravi
Liu, Di
Liu, Weichen
Zhang, Lei
Huai, Shuo
Author_xml – sequence: 1
  givenname: Shuo
  surname: Huai
  fullname: Huai, Shuo
  email: shuo001@ntu.edu.sg
  organization: Nanyang Technological University,School of Computer Science and Engineering,Singapore
– sequence: 2
  givenname: Lei
  surname: Zhang
  fullname: Zhang, Lei
  email: letty.zhang@ntu.edu.sg
  organization: Nanyang Technological University,HP-NTU Digital Manufacturing Corporate Lab,Singapore
– sequence: 3
  givenname: Di
  surname: Liu
  fullname: Liu, Di
  email: liu.di@ntu.edu.sg
  organization: Nanyang Technological University,HP-NTU Digital Manufacturing Corporate Lab,Singapore
– sequence: 4
  givenname: Weichen
  surname: Liu
  fullname: Liu, Weichen
  email: liu@ntu.edu.sg
  organization: Nanyang Technological University,School of Computer Science and Engineering,Singapore
– sequence: 5
  givenname: Ravi
  surname: Subramaniam
  fullname: Subramaniam, Ravi
  email: ravi.subramaniam@hp.com
  organization: Innovations and Experiences - Business Personal Systems, HP Inc.,Palo Alto,California,USA
ContentType Conference Proceeding
DOI 10.1109/DAC18074.2021.9586309
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP) 1998-present
EISBN 9781665432740
1665432748
EndPage 156
ExternalDocumentID 9586309
Genre orig-research
GrantInformation_xml – fundername: National Research Foundation
  funderid: 10.13039/501100001321
ISICitedReferencesCount 13
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
OpenAccessLink https://dr.ntu.edu.sg/bitstream/10356/155572/3/ZeroBN_Accept_Version.pdf
PageCount 6
PublicationCentury 2000
PublicationDate 2021-Dec.-5
PublicationDateYYYYMMDD 2021-12-05
PublicationDate_xml – month: 12
  year: 2021
  text: 2021-Dec.-5
  day: 05
PublicationDecade 2020
PublicationTitle 2021 58th ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev DAC
PublicationYear 2021
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 151
SubjectTerms Design automation
Embedded systems
Heuristic algorithms
Pipelines
Predictive models
Privacy
Training
Title ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems
URI https://ieeexplore.ieee.org/document/9586309