Iterative Deep Neural Network Quantization With Lipschitz Constraint
| Published in: | IEEE Transactions on Multimedia, Volume 22, Issue 7, pp. 1874-1888 |
|---|---|
| Main authors: | Xu, Yuhui; Dai, Wenrui; Qi, Yingyong; Zou, Junni; Xiong, Hongkai |
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2020 |
| Subjects: | Artificial neural networks; Computational modeling; Computer architecture; Convergence; Convolution; Image coding; Iterative quantization; Lipschitz constraint; Measurement; Network compression; Neural networks; Object detection; Object recognition; Partitions; Performance enhancement; Quantization (signal); Semantic segmentation; Semantics; State-of-the-art reviews |
| ISSN: | 1520-9210, 1941-0077 |
| Abstract | Network quantization offers an effective solution to deep neural network compression for practical usage. Existing network quantization methods cannot theoretically guarantee convergence. This paper proposes a novel iterative framework for network quantization with arbitrary bit-widths. We present two Lipschitz-constraint-based quantization strategies, namely width-level network quantization (WLQ) and multi-level network quantization (MLQ), for high-bit and extremely low-bit (ternary) quantization, respectively. In WLQ, a Lipschitz-based partition is developed to divide the parameters in each layer into two groups: one for quantization and the other for re-training to compensate for the quantization loss. WLQ is further extended to MLQ by introducing a layer partition to suppress the quantization loss at extremely low bit-widths. The Lipschitz-based partition is proven to guarantee the convergence of the quantized networks. Moreover, the proposed framework is complementary to other network compression methods such as activation quantization, pruning, and efficient network architectures. The proposed framework is evaluated on state-of-the-art deep neural networks, namely AlexNet, VGG-16, GoogLeNet, and ResNet-18. Experimental results show that it improves performance on tasks such as classification, object detection, and semantic segmentation. (A minimal code sketch illustrating the iterative partition/quantize/re-train idea follows the record fields below.) |
|---|---|
| Author | Xu, Yuhui; Dai, Wenrui; Qi, Yingyong; Zou, Junni; Xiong, Hongkai |
| Author_xml | 1) Xu, Yuhui (yuhuixu@sjtu.edu.cn; ORCID 0000-0002-7109-7140), Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China. 2) Dai, Wenrui (daiwenrui@sjtu.edu.cn; ORCID 0000-0003-2522-5778), Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China. 3) Qi, Yingyong (yqi@uci.edu), Department of Mathematics, University of California at Irvine, Irvine, CA, USA. 4) Zou, Junni (zou-jn@cs.sjtu.edu.cn; ORCID 0000-0002-9694-9880), Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China. 5) Xiong, Hongkai (xionghongkai@sjtu.edu.cn; ORCID 0000-0003-4552-0029), Department of Electronic Engineering, Shanghai Jiao Tong University, Shanghai, China. |
| CODEN | ITMUF8 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2020 |
| DOI | 10.1109/TMM.2019.2949857 |
| Discipline | Engineering; Computer Science |
| EISSN | 1941-0077 |
| EndPage | 1888 |
| Genre | orig-research |
| GrantInformation_xml | National Natural Science Foundation of China (10.13039/501100001809), grants 61971285, 61529101, 61831018, 61425011, 61622112, 61720106001, 61932022, 61931023; Program of Shanghai Academic Research Leader (10.13039/501100012247), grant 17XD1401900 |
| ISICitedReferencesCount | 15 |
| ISSN | 1520-9210 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 7 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0003-2522-5778 0000-0002-9694-9880 0000-0003-4552-0029 0000-0002-7109-7140 |
| PageCount | 15 |
| PublicationDate | 2020-07-01 |
| PublicationPlace | Piscataway |
| PublicationTitle | IEEE transactions on multimedia |
| PublicationTitleAbbrev | TMM |
| PublicationYear | 2020 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1874 |
| SubjectTerms | Artificial neural networks; Computational modeling; Computer architecture; Convergence; Convolution; Image coding; Iterative quantization; Lipschitz constraint; Measurement; Network compression; Neural networks; Object detection; Object recognition; Partitions; Performance enhancement; Quantization (signal); Semantic segmentation; Semantics; State-of-the-art reviews |
| Title | Iterative Deep Neural Network Quantization With Lipschitz Constraint |
| URI | https://ieeexplore.ieee.org/document/8884243 https://www.proquest.com/docview/2420296696 |
| Volume | 22 |
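The abstract above describes an iterative partition, quantize, and re-train loop (WLQ): a Lipschitz-based criterion decides which weights of a layer are quantized at each step, while the remaining weights stay at full precision and are re-trained to compensate for the quantization loss. The Python sketch below illustrates only that general shape, under stated assumptions: the function names (`quantize_uniform`, `sensitivity_partition`, `iterative_wlq`), the rank-1 spectral-norm proxy used as the partition score, and the uniform quantization grid are placeholders invented for illustration; the paper's actual Lipschitz-based partition, the SGD re-training step, and the MLQ layer partition for ternary quantization are not reproduced here.

```python
# Minimal sketch (not the authors' reference implementation) of an iterative
# "partition -> quantize -> re-train" scheme like the WLQ strategy described
# in the abstract. The partition criterion and quantizer are assumptions.
import numpy as np


def quantize_uniform(values, num_bits):
    """Uniformly quantize a 1-D array to 2**num_bits levels over its own range."""
    levels = 2 ** num_bits - 1
    v_min, v_max = values.min(), values.max()
    if v_max == v_min:
        return values.copy()
    step = (v_max - v_min) / levels
    return np.round((values - v_min) / step) * step + v_min


def sensitivity_partition(weight, frac):
    """Split a layer's weight matrix into (quantize_mask, retrain_mask).

    Stand-in for the paper's Lipschitz-based criterion: weights are ranked by a
    rank-1 spectral proxy of their contribution to the layer's Lipschitz
    constant, and the least sensitive fraction `frac` is selected for
    quantization while the rest stays full precision for re-training.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    proxy = np.abs(s[0] * np.outer(u[:, 0], vt[0]))
    threshold = np.quantile(proxy, frac)
    quantize_mask = proxy <= threshold
    return quantize_mask, ~quantize_mask


def iterative_wlq(weight, num_bits=4, steps=4):
    """Iteratively quantize a growing fraction of one layer's weights.

    At each step a new group of weights is quantized and frozen; in the real
    framework the remaining full-precision group would then be re-trained with
    SGD to absorb the quantization error (omitted here).
    """
    w = weight.astype(np.float64).copy()
    frozen = np.zeros(w.shape, dtype=bool)
    for step in range(1, steps + 1):
        quantize_mask, _ = sensitivity_partition(w, step / steps)
        quantize_mask &= ~frozen  # never touch already-quantized weights
        if quantize_mask.any():
            w[quantize_mask] = quantize_uniform(w[quantize_mask], num_bits)
            frozen |= quantize_mask
        # Re-training of the weights in ~frozen would happen here before the
        # next iteration in the actual framework.
    return w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    layer = rng.normal(size=(64, 128))
    q = iterative_wlq(layer, num_bits=4, steps=4)
    print("distinct weight values after quantization:", np.unique(q).size)
```

Under these assumptions, each iteration freezes a larger low-sensitivity portion of the layer, which is the property that lets the still-trainable weights compensate for the quantization loss between iterations; the convergence guarantee claimed in the abstract rests on the paper's Lipschitz-based choice of this partition, not on the simplified proxy used here.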