ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems
| Published in: | 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 151-156 |
|---|---|
| Main Authors: | Huai, Shuo; Zhang, Lei; Liu, Di; Liu, Weichen; Subramaniam, Ravi |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 05.12.2021 |
| Abstract | Edge devices have been widely adopted to bring deep learning applications onto low-power embedded systems, mitigating the privacy and latency issues of accessing cloud servers. The increasing computational demand of complex neural network models leads to high latency on edge devices with limited resources. Many application scenarios are real-time and have strict latency constraints, while conventional neural network compression methods are not latency-oriented. In this work, we propose a novel training method for compact neural networks that reduces model latency on latency-critical edge systems. A latency predictor is also introduced to guide and optimize this procedure. Coupled with the latency predictor, our method can guarantee the latency of a compact model with only one training process. The experimental results show that, compared to state-of-the-art model compression methods, our approach fits the 'hard' latency constraint well, significantly reducing latency with a mild accuracy drop. To satisfy a 34 ms latency constraint, we compact ResNet-50 with a 0.82% accuracy drop, and for GoogLeNet we even increase the accuracy by 0.3%. |
|---|---|
| Author | Huai, Shuo; Zhang, Lei; Liu, Di; Liu, Weichen; Subramaniam, Ravi |
| Author_xml | 1. Huai, Shuo (shuo001@ntu.edu.sg), Nanyang Technological University, School of Computer Science and Engineering, Singapore; 2. Zhang, Lei (letty.zhang@ntu.edu.sg), Nanyang Technological University, HP-NTU Digital Manufacturing Corporate Lab, Singapore; 3. Liu, Di (liu.di@ntu.edu.sg), Nanyang Technological University, HP-NTU Digital Manufacturing Corporate Lab, Singapore; 4. Liu, Weichen (liu@ntu.edu.sg), Nanyang Technological University, School of Computer Science and Engineering, Singapore; 5. Subramaniam, Ravi (ravi.subramaniam@hp.com), Innovations and Experiences - Business Personal Systems, HP Inc., Palo Alto, California, USA |
| ContentType | Conference Proceeding |
| DOI | 10.1109/DAC18074.2021.9586309 |
| EISBN | 9781665432740 1665432748 |
| EndPage | 156 |
| ExternalDocumentID | 9586309 |
| Genre | orig-research |
| GrantInformation_xml | Funder: National Research Foundation (funder ID: 10.13039/501100001321) |
| ISICitedReferencesCount | 13 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://dr.ntu.edu.sg/bitstream/10356/155572/3/ZeroBN_Accept_Version.pdf |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Dec.-5 |
| PublicationDateYYYYMMDD | 2021-12-05 |
| PublicationDate_xml | – month: 12 year: 2021 text: 2021-Dec.-5 day: 05 |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 58th ACM/IEEE Design Automation Conference (DAC) |
| PublicationTitleAbbrev | DAC |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 151 |
| SubjectTerms | Design automation; Embedded systems; Heuristic algorithms; Pipelines; Predictive models; Privacy; Training |
| Title | ZeroBN: Learning Compact Neural Networks For Latency-Critical Edge Systems |
| URI | https://ieeexplore.ieee.org/document/9586309 |