Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review


Bibliographic Details
Published in: Proceedings of the IEEE, Vol. 111, No. 1, pp. 1-50
Main authors: Shuvo, Md. Maruf Hossain; Islam, Syed Kamrul; Cheng, Jianlin; Morshed, Bashir I.
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.01.2023
ISSN: 0018-9219 (print); 1558-2256 (electronic)
Online access: Full text
Abstract: Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying these highly accurate models for data-driven, learned, automatic, and practical machine learning (ML) solutions in end-user applications remains challenging. DL algorithms are often computationally expensive, power-hungry, and require large memory to process complex and iterative operations of millions of parameters. Hence, training and inference of DL models are typically performed on high-performance computing (HPC) clusters in the cloud. Data transmission to the cloud results in high latency, round-trip delay, security and privacy concerns, and the inability to make real-time decisions. Thus, processing on edge devices can significantly reduce cloud transmission costs. Edge devices are end devices closest to the user, such as mobile phones, cyber-physical systems (CPSs), wearables, Internet of Things (IoT) devices, embedded and autonomous systems, and intelligent sensors. These devices have limited memory, computing resources, and power-handling capability. Therefore, optimization techniques at both the hardware and software levels have been developed to handle DL deployment efficiently on the edge. Understanding the existing research, challenges, and opportunities is fundamental to leveraging the next generation of edge devices with artificial intelligence (AI) capability. Four main research directions have been pursued for efficient DL inference on edge devices: 1) novel DL architecture and algorithm design; 2) optimization of existing DL methods; 3) development of algorithm-hardware codesign; and 4) efficient accelerator design for DL deployment. This article surveys each of these four research directions, providing a comprehensive review of state-of-the-art tools and techniques for efficient edge inference.
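As an illustrative aside (not drawn from the article itself), the second research direction named in the abstract, optimization of existing DL methods, commonly includes post-training quantization. A minimal sketch of uniform symmetric 8-bit quantization, which replaces 4-byte float weights with one-byte integer codes plus a per-tensor scale, for memory-constrained edge inference; the weight values and function names here are purely illustrative:

```python
# Minimal sketch of uniform symmetric 8-bit post-training quantization.
# Pure-Python illustration; real deployments use framework tooling
# (e.g., TensorFlow Lite or PyTorch quantization) instead.

def quantize_int8(weights):
    """Map float weights to int8 codes plus a per-tensor scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the int8 range.
    q = [max(-128, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 codes."""
    return [qi * scale for qi in q]

weights = [0.42, -1.27, 0.003, 0.9, -0.5]
q, scale = quantize_int8(weights)
recovered = dequantize(q, scale)
max_err = max(abs(w - r) for w, r in zip(weights, recovered))
print(q)        # int8 codes: one byte each instead of a 4-byte float
print(max_err)  # reconstruction error is bounded by scale / 2
```

The design choice illustrated is the classic memory/accuracy trade-off: storage drops by roughly 4x, while the worst-case per-weight error stays below half the quantization step (scale / 2).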
Authors:
1. Md. Maruf Hossain Shuvo (ORCID 0000-0002-3498-4947), Department of Electrical Engineering and Computer Science, Analog/Mixed Signal VLSI and Devices Laboratory (AVDL), University of Missouri, Columbia, MO, USA
2. Syed Kamrul Islam (ORCID 0000-0002-0501-0027), Department of Electrical Engineering and Computer Science, Analog/Mixed Signal VLSI and Devices Laboratory (AVDL), University of Missouri, Columbia, MO, USA
3. Jianlin Cheng, Department of Electrical Engineering and Computer Science, Bioinformatics and Machine Learning Laboratory (BML), University of Missouri, Columbia, MO, USA
4. Bashir I. Morshed (ORCID 0000-0002-2178-433X), Department of Computer Science, Cyber Physical Systems (CPS) Laboratory, Texas Tech University, Lubbock, TX, USA
CODEN: IEEPAD
Copyright: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 2023
DOI: 10.1109/JPROC.2022.3226481
Discipline: Engineering
Funding: College of Engineering, University of Missouri, Columbia, MO, USA
License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/legalcode)
Open access link: https://ieeexplore.ieee.org/document/9985008
– ident: ref8
  doi: 10.3390/fi12070113
– volume-title: Intel® Edison Development Platform
  year: 2022
  ident: ref254
– ident: ref326
  doi: 10.1109/MICRO.2014.58
– ident: ref173
  doi: 10.1145/3218603.3218643
– volume: 37
  start-page: 1737
  volume-title: Proc. 32nd Int. Conf. Mach. Learn. (ICML)
  ident: ref116
  article-title: Deep learning with limited numerical precision
– ident: ref339
  doi: 10.1109/MICRO.2018.00062
– ident: ref147
  doi: 10.1073/pnas.1604850113
– ident: ref222
  doi: 10.1109/ISCA.2018.00060
– year: 2019
  ident: ref106
  article-title: Gate decorator: Global filter pruning method for accelerating deep convolutional neural networks
  publication-title: arXiv:1909.08174
– ident: ref329
  doi: 10.1109/ISCA.2018.00012
– volume-title: DesignWare ARC EV Processors for Embedded Vision
  year: 2022
  ident: ref306
– ident: ref171
  doi: 10.1145/3287624.3287627
– year: 2017
  ident: ref213
  article-title: Adaptive neural networks for efficient inference
  publication-title: arXiv:1702.07811
– ident: ref355
  doi: 10.1109/PDP2018.2018.00023
– volume-title: NVIDIA TensorRT
  year: 2022
  ident: ref268
– year: 2019
  ident: ref80
  article-title: Efficient Winograd convolution via integer arithmetic
  publication-title: arXiv:1901.01965
– volume-title: Open Neural Network Exchange
  year: 2022
  ident: ref260
– ident: ref320
  doi: 10.1109/JSSC.2017.2778281
– ident: ref125
  doi: 10.1109/DAC.2018.8465915
– ident: ref140
  doi: 10.1007/978-3-319-46493-0_32
– ident: ref13
  doi: 10.1109/TPDS.2020.3030548
– ident: ref191
  doi: 10.1609/aaai.v34i04.5963
– ident: ref214
  doi: 10.1109/ICDCS.2017.226
– ident: ref278
  doi: 10.1145/3089801.3089804
– year: 2017
  ident: ref99
  article-title: Exploring the regularity of sparse structure in convolutional neural networks
  publication-title: arXiv:1705.08922
– ident: ref174
  doi: 10.1145/2627369.2627613
– volume: 29
  start-page: 2074
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref98
  article-title: Learning structured sparsity in deep neural networks
– ident: ref353
  doi: 10.1016/j.pmcj.2017.07.014
– start-page: 2285
  volume-title: Proc. ICML
  ident: ref149
  article-title: Compressing neural networks with the hashing trick
– start-page: 1269
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref202
  article-title: Exploiting linear structure within convolutional networks for efficient evaluation
– start-page: 3505
  volume-title: Proc. 26th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining
  ident: ref290
  article-title: DeepSpeed
– ident: ref325
  doi: 10.23919/VLSIC.2019.8778056
– volume-title: Deploy Machine Learning Models on Mobile and IoT Devices
  year: 2022
  ident: ref272
– ident: ref383
  doi: 10.1109/TIFS.2015.2400395
– ident: ref400
  doi: 10.1109/ISNE.2017.7968748
– ident: ref53
  doi: 10.1109/CVPR.2016.90
– start-page: 963
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref135
  article-title: Expectation backpropagation: Parameter-free training of multilayer neural networks with continuous or discrete weights
– ident: ref304
  doi: 10.1109/UCC48980.2020.00066
– year: 2014
  ident: ref51
  article-title: Very deep convolutional networks for large-scale image recognition
  publication-title: arXiv:1409.1556
– year: 2017
  ident: ref152
  article-title: Incremental network quantization: Towards lossless CNNs with low-precision weights
  publication-title: arXiv:1702.03044
– ident: ref197
  doi: 10.1145/3089801.3089806
– volume: 32
  start-page: 8026
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref257
  article-title: PyTorch: An imperative style, high-performance deep learning library
– ident: ref319
  doi: 10.1145/3079856.3080246
– ident: ref208
  doi: 10.1145/2906388.2906396
– year: 2022
  ident: ref295
  article-title: ZeroQuant: Efficient and affordable post-training quantization for large-scale transformers
  publication-title: arXiv:2206.01861
– volume-title: Products, Helping You Bring Local AI to Applications From Prototype to Production
  year: 2022
  ident: ref312
– ident: ref220
  doi: 10.1109/CVPR.2019.01147
– ident: ref347
  doi: 10.1145/3161174
– year: 2018
  ident: ref297
  article-title: Horovod: Fast and easy distributed deep learning in TensorFlow
  publication-title: arXiv:1802.05799
– ident: ref237
  doi: 10.1109/JETCAS.2019.2910232
– ident: ref341
  doi: 10.1109/ASPDAC.2017.7858419
– volume-title: X-CUBE-AI, AI Expansion Pack for STM32CubeMX
  year: 2022
  ident: ref266
– volume: 4107
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref141
  article-title: Binarized neural networks
– volume: 15
  start-page: 1929
  issue: 1
  year: 2014
  ident: ref48
  article-title: Dropout: A simple way to prevent neural networks from overfitting
  publication-title: J. Mach. Learn. Res.
– year: 2017
  ident: ref59
  article-title: Learning transferable architectures for scalable image recognition
  publication-title: arXiv:1707.07012
– ident: ref233
  doi: 10.1109/MICRO.2016.7783723
– year: 2016
  ident: ref146
  article-title: Trained ternary quantization
  publication-title: arXiv:1612.01064
– volume-title: DNNDK User Guide
  year: 2022
  ident: ref267
– ident: ref389
  doi: 10.3233/JIFS-169699
– ident: ref352
  doi: 10.1109/ICCCN.2018.8487352
– ident: ref308
  doi: 10.23919/VLSIC.2019.8778006
– year: 2014
  ident: ref163
  article-title: Compressing deep convolutional networks using vector quantization
  publication-title: arXiv:1412.6115
– ident: ref151
  doi: 10.1109/I2MTC50364.2021.9459794
– ident: ref423
  doi: 10.1109/MWC.006.2100699
– year: 2015
  ident: ref157
  article-title: Deep compression: Compressing deep neural networks with pruning, trained quantization and Huffman coding
  publication-title: arXiv:1510.00149
– ident: ref332
  doi: 10.1109/EEEI.2014.7005803
– ident: ref130
  doi: 10.1109/ISVLSI.2016.111
– volume-title: Qualcomm Neural Processing SDK for AI
  year: 2021
  ident: ref270
– ident: ref415
  doi: 10.1145/3240765.3240801
– ident: ref6
  doi: 10.1109/MSP.2017.2765695
– ident: ref97
  doi: 10.1145/3352460.3358269
– ident: ref309
  doi: 10.23919/VLSIC.2019.8778193
– year: 2016
  ident: ref205
  article-title: Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer
  publication-title: arXiv:1612.03928
– ident: ref259
  doi: 10.1145/3089801.3089805
– start-page: 1
  volume-title: Proc. 56th Annu. Design Autom. Conf.
  ident: ref127
  article-title: BitBlade: Area and energy-efficient precision-scalable neural network accelerator with bitwise summation
– ident: ref322
  doi: 10.1109/MM.2018.053631145
– ident: ref126
  doi: 10.1109/ISCA.2018.00069
– volume-title: AI for the Edge
  year: 2022
  ident: ref307
– start-page: 10118
  volume-title: Proc. Int. Conf. Mach. Learn.
  ident: ref292
  article-title: 1-bit Adam: Communication efficient large-scale training with Adam’s convergence speed
– year: 2016
  ident: ref137
  article-title: BinaryNet: Training deep neural networks with weights and activations constrained to +1 or −1
  publication-title: arXiv:1602.02830
– ident: ref406
  doi: 10.1016/j.neucom.2017.09.046
– ident: ref81
  doi: 10.1109/FCCM.2017.64
– ident: ref74
  doi: 10.1109/CVPRW.2018.00215
– ident: ref21
  doi: 10.1109/ICDCS.2017.94
– ident: ref9
  doi: 10.1016/j.sysarc.2020.101887
– volume-title: Worldwide and U.S. IoT Cellular Connections Forecast, 2021–2025
  year: 2022
  ident: ref3
– ident: ref357
  doi: 10.1609/aimag.v35i4.2513
– volume-title: CEVA Deep Neural Network (CDNN)
  year: 2022
  ident: ref269
– ident: ref187
  doi: 10.1609/aaai.v32i1.11601
– ident: ref150
  doi: 10.1109/ICASSP.2017.7953288
– start-page: 50
  volume-title: Proc. ACM Symp. Cloud Comput.
  ident: ref302
  article-title: BigDL
– ident: ref22
  doi: 10.1145/3132211.3134456
– year: 2017
  ident: ref70
  article-title: MobileNets: Efficient convolutional neural networks for mobile vision applications
  publication-title: arXiv:1704.04861
– ident: ref95
  doi: 10.1109/TC.2019.2914438
– year: 2021
  ident: ref294
  article-title: 1-bit Lamb: Communication efficient large-scale large-batch training with Lamb’s convergence speed
  publication-title: arXiv:2104.06069
– ident: ref396
  doi: 10.1145/3154448.3154452
– ident: ref241
  doi: 10.1109/ISCA.2018.00068
– ident: ref314
  doi: 10.1145/3174243.3174253
– ident: ref386
  doi: 10.1109/BTAS46853.2019.9185981
– volume-title: EdgeML: Machine learning for resource-constrained edge devices
  year: 2020
  ident: ref30
– year: 2016
  ident: ref61
  article-title: Neural architecture search with reinforcement learning
  publication-title: arXiv:1611.01578
– ident: ref277
  doi: 10.3390/electronics5040088
– ident: ref335
  doi: 10.3390/s21092984
– ident: ref393
  doi: 10.1145/3081333.3081359
– ident: ref419
  doi: 10.1145/3079856.3080244
– ident: ref104
  doi: 10.1109/ICPR.2018.8546129
– start-page: 551
  volume-title: Proc. USENIX Annu. Tech. Conf.
  ident: ref291
  article-title: ZeRO-Offload: Democratizing billion-scale model training
– ident: ref49
  doi: 10.1109/5.726791
– year: 2019
  ident: ref155
  article-title: A simplified fully quantized transformer for end-to-end speech recognition
  publication-title: arXiv:1911.03604
– ident: ref390
  doi: 10.3390/s18113726
– year: 2018
  ident: ref159
  article-title: CMSIS-NN: Efficient neural network kernels for arm Cortex-M CPUs
  publication-title: arXiv:1801.06601
– ident: ref235
  doi: 10.1109/MDAT.2017.2741463
– volume-title: Semi-supervised learning literature survey
  year: 2005
  ident: ref40
– year: 2022
  ident: ref102
  article-title: Delta keyword transformer: Bringing transformers to the edge through dynamically pruned multi-head self-attention
  publication-title: arXiv:2204.03479
– ident: ref200
  doi: 10.1145/2994551.2994564
– year: 2016
  ident: ref35
  article-title: An overview of gradient descent optimization algorithms
  publication-title: arXiv:1609.04747
– ident: ref398
  doi: 10.1145/3131895
– ident: ref413
  doi: 10.1145/3242897
– ident: ref422
  doi: 10.2200/s00832ed1v01y201802aim037
– ident: ref342
  doi: 10.1109/DAC.2018.8465793
– ident: ref361
  doi: 10.1145/3494834.3500240
– ident: ref240
  doi: 10.1109/ISCA.2018.00061
– volume: 5
  start-page: 532
  year: 2020
  ident: ref33
  article-title: Cross-validation
  publication-title: Encyclopedia Database Syst.
– ident: ref68
  doi: 10.18653/v1/2020.acl-main.686
– ident: ref103
  doi: 10.1109/ISQED54688.2022.9806197
– ident: ref287
  doi: 10.1145/2996864
– start-page: 1
  volume-title: Proc. 15th ACM Conf. Embedded Netw. Sensor Syst.
  ident: ref158
  article-title: DeepIoT: Compressing deep neural network structures for sensing systems with a compressor-critic framework
– volume-title: Jetson TX2 Module
  year: 2022
  ident: ref253
– ident: ref378
  doi: 10.1145/3229556.3229562
– ident: ref424
  doi: 10.1109/ACCESS.2018.2870052
– ident: ref239
  doi: 10.1145/3297858.3304028
– ident: ref111
  doi: 10.1109/FCCM.2017.47
– year: 2022
  ident: ref156
  article-title: I-ViT: Integer-only quantization for efficient vision transformer inference
  publication-title: arXiv:2207.01405
– year: 2018
  ident: ref113
  article-title: Training deep neural networks with 8-bit floating point numbers
  publication-title: arXiv:1812.08011
– ident: ref262
  doi: 10.1145/3373376.3378534
– ident: ref344
  doi: 10.1145/3287624.3287715
– start-page: 236
  volume-title: Proc. Int. Symp. Comput. Archit.
  ident: ref345
  article-title: Sparse ReRAM engine: Joint exploration of activation and weight sparsity in compressed neural networks
– ident: ref182
  doi: 10.48550/arXiv.1503.02531
– ident: ref196
  doi: 10.1109/CVPR.2018.00171
– ident: ref372
  doi: 10.1007/978-3-030-27562-4_6
– ident: ref243
  doi: 10.1145/3007787.3001163
– year: 2016
  ident: ref145
  article-title: Ternary weight networks
  publication-title: arXiv:1605.04711
– ident: ref420
  doi: 10.1109/HPCA.2019.00027
– ident: ref199
  doi: 10.1109/EMC2-NIPS53020.2019.00013
– ident: ref348
  doi: 10.1145/2820975.2820980
– start-page: 1117
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref136
  article-title: Backpropagation for energy-efficient neuromorphic computing
– year: 2017
  ident: ref172
  article-title: ApproxDBN: Approximate computing for discriminative deep belief networks
  publication-title: arXiv:1704.03993
– ident: ref281
  doi: 10.1145/2935643.2935650
– start-page: 1
  volume-title: Proc. Deep Learn. Unsupervised Feature Learn. Workshop
  ident: ref120
  article-title: Improving the speed of neural networks on CPUs
– ident: ref217
  doi: 10.1007/978-3-030-01261-8_25
– ident: ref373
  doi: 10.1109/SmartWorld-UIC-ATC-SCALCOM-IOP-SCI.2019.00194
– ident: ref164
  doi: 10.1145/3289602.3293902
– start-page: 701
  volume-title: Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE)
  ident: ref170
  article-title: ApproxANN: An approximate computing framework for artificial neural network
– year: 2015
  ident: ref175
  article-title: Compression of deep convolutional neural networks for fast and low power mobile applications
  publication-title: arXiv:1511.06530
– ident: ref216
  doi: 10.1109/ICCD.2017.49
– ident: ref376
  doi: 10.1109/TNET.2020.3042320
– ident: ref19
  doi: 10.1109/JPROC.2019.2921977
– ident: ref244
  doi: 10.1145/3140659.3080254
– volume: 25
  start-page: 1097
  volume-title: Proc. Adv. Neural Inf. Process. Syst. (NIPS)
  ident: ref50
  article-title: ImageNet classification with deep convolutional neural networks
– ident: ref249
  doi: 10.1109/TC.2019.2924215
– start-page: 1822
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref92
  article-title: Architectural complexity measures of recurrent neural networks
– ident: ref231
  doi: 10.1109/ICTAI.2019.00197
– ident: ref256
  doi: 10.1145/2647868.2654889
– ident: ref336
  doi: 10.1145/3007787.3001140
– volume-title: AutoML
  year: 2022
  ident: ref62
– ident: ref129
  doi: 10.1145/3352460.3358295
– year: 2019
  ident: ref301
  article-title: TF-replicator: Distributed machine learning for researchers
  publication-title: arXiv:1902.00465
– ident: ref55
  doi: 10.1609/aaai.v31i1.11231
– ident: ref131
  doi: 10.23919/VLSIC.2017.8008533
– start-page: 2407
  volume-title: Proc. 24th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining
  ident: ref366
  article-title: Not just privacy
– ident: ref5
  doi: 10.3390/electronics8030292
– volume-title: FairScale: A general purpose modular PyTorch library for high performance and large scale training
  year: 2021
  ident: ref299
– ident: ref226
  doi: 10.18653/v1/2020.emnlp-demos.6
– year: 2017
  ident: ref90
  article-title: Factorization tricks for LSTM networks
  publication-title: arXiv:1703.10722
– ident: ref401
  doi: 10.1145/3299710.3211336
– ident: ref72
  doi: 10.1109/CVPR.2018.00474
– ident: ref236
  doi: 10.1109/ICPR.2018.8545462
– year: 2020
  ident: ref298
  article-title: Benchmark tests of convolutional neural network and graph convolutional network on HorovodRunner enabled spark clusters
  publication-title: arXiv:2005.05510
– start-page: 561
  volume-title: Proc. 13th USENIX Symp. Operating Syst. Design Implement. (OSDI)
  ident: ref303
  article-title: Ray: A distributed framework for emerging AI applications
– ident: ref26
  doi: 10.1145/3267809.3267828
– ident: ref177
  doi: 10.1145/3093336.3037746
– ident: ref387
  doi: 10.1109/AIPR50011.2020.9425332
– ident: ref343
  doi: 10.1109/ACCESS.2018.2874823
– ident: ref169
  doi: 10.1109/CASES.2015.7324548
– volume-title: Machine Learning: A Probabilistic Perspective
  year: 2012
  ident: ref32
– year: 2014
  ident: ref44
  article-title: Learning activation functions to improve deep neural networks
  publication-title: arXiv:1412.6830
– ident: ref283
  doi: 10.1109/CVPR.2016.91
– ident: ref183
  doi: 10.1609/aaai.v33i01.33013779
– start-page: 145
  volume-title: Proc. Design, Autom. Test Eur. Conf. Exhib. (DATE)
  ident: ref166
  article-title: Multiplier-less artificial neurons exploiting error resiliency for energy-efficient neural computing
– ident: ref248
  doi: 10.1109/TNNLS.2018.2852335
– ident: ref186
  doi: 10.1609/aaai.v32i1.11876
– ident: ref123
  doi: 10.1109/LCA.2016.2597140
– year: 2019
  ident: ref77
  article-title: ANTNets: Mobile convolutional neural networks for resource efficient image classification
  publication-title: arXiv:1904.03775
– ident: ref421
  doi: 10.1145/3472291
– year: 2015
  ident: ref93
  article-title: Grid long short-term memory
  publication-title: arXiv:1507.01526
– year: 2018
  ident: ref368
  article-title: PipeDream: Fast and efficient pipeline parallel DNN training
  publication-title: arXiv:1806.03377
– ident: ref411
  doi: 10.1111/1754-9485.13261
– start-page: 265
  volume-title: Proc. 12th USENIX Symp. Operating Syst. Design Implement.
  ident: ref255
  article-title: TensorFlow: A system for large-scale machine learning
– start-page: 1
  volume-title: Proc. USENIX Workshop Hot Topics Edge Comput.
  ident: ref370
  article-title: MODI: Mobile deep inference made efficient by edge computing
– ident: ref337
  doi: 10.1038/s41586-018-0180-5
– ident: ref232
  doi: 10.1109/ISSCC.2019.8662302
– ident: ref117
  doi: 10.1109/ICASSP.2015.7178146
– year: 2018
  ident: ref69
  article-title: DARTS: Differentiable architecture search
  publication-title: arXiv:1806.09055
– year: 2017
  ident: ref184
  article-title: Like what you like: Knowledge distill via neuron selectivity transfer
  publication-title: arXiv:1707.01219
– year: 2017
  ident: ref198
  article-title: SquishedNets: Squishing SqueezeNet further for edge device scenarios via deep evolutionary synthesis
  publication-title: arXiv:1711.07459
– ident: ref87
  doi: 10.1007/s11633-016-1006-2
– year: 2016
  ident: ref142
  article-title: Quantized neural networks: Training neural networks with low precision weights and activations
  publication-title: arXiv:1609.07061
– ident: ref83
  doi: 10.1145/3289602.3293905
– ident: ref38
  doi: 10.1007/978-0-387-84858-7_14
– ident: ref100
  doi: 10.1109/WACV.2018.00083
– ident: ref408
  doi: 10.1109/ICCCBDA51879.2021.9442600
– ident: ref67
  doi: 10.1109/CVPR.2019.00293
– year: 2015
  ident: ref133
  article-title: Neural networks with few multiplications
  publication-title: arXiv:1510.03009
– year: 2020
  ident: ref227
  article-title: SpAtten: Efficient sparse attention architecture with cascade token and head pruning
  publication-title: arXiv:2012.09852
– ident: ref369
  doi: 10.21437/Interspeech.2021-1816
– volume-title: AutoKeras
  year: 2022
  ident: ref63
– start-page: 1
  volume-title: Proc. 11th IEEE/ACM/IFIP Int. Conf. Hardw./Softw. Codesign Syst. Synth.
  ident: ref211
  article-title: Runtime configurable deep neural networks for energy-accuracy trade-off
– year: 2020
  ident: ref194
  article-title: Flexible dataset distillation: Learn labels instead of images
  publication-title: arXiv:2006.08572
– ident: ref168
  doi: 10.1109/ISLPED.2017.8009173
– ident: ref246
  doi: 10.1145/3352460.3358291
– start-page: 231
  volume-title: Pro Android Apps Performance Optimization
  year: 2012
  ident: ref279
  article-title: RenderScript
  doi: 10.1007/978-1-4302-4000-6_9
– ident: ref261
  doi: 10.1609/aaai.v34i04.5954
– ident: ref212
  doi: 10.1109/ICPR.2016.7900006
– ident: ref371
  doi: 10.23919/DATE.2017.7927211
– ident: ref12
  doi: 10.1007/s00521-018-3761-1
– ident: ref275
  doi: 10.1609/aaai.v35i2.16179
– ident: ref416
  doi: 10.1109/FCCM.2019.00033
– ident: ref310
  doi: 10.1109/VLSIC.2018.8502404
– ident: ref403
  doi: 10.1109/ISCAS51556.2021.9401083
– ident: ref2
  doi: 10.1109/JPROC.2017.2761740
– ident: ref201
  doi: 10.1145/3109761.3109804
– ident: ref409
  doi: 10.1109/TIM.2020.3018831
– ident: ref273
  doi: 10.1086/719650
– ident: ref39
  doi: 10.1007/s10994-019-05855-6
– ident: ref1
  doi: 10.7551/mitpress/12832.003.0015
– start-page: 1
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref300
  article-title: Mesh-TensorFlow: Deep learning for supercomputers
– year: 2017
  ident: ref24
  article-title: DeepCache: Principled cache for mobile deep vision
  publication-title: arXiv:1712.01670
– ident: ref333
  doi: 10.23919/DATE48585.2020.9116529
– ident: ref350
  doi: 10.1109/SEC.2016.38
– ident: ref234
  doi: 10.1109/LCA.2020.2979965
– ident: ref349
  doi: 10.1145/3318216.3363316
– ident: ref190
  doi: 10.1109/CVPR.2018.00454
– ident: ref57
  doi: 10.1109/CVPR.2017.243
– year: 2017
  ident: ref221
  article-title: Rethinking atrous convolution for semantic image segmentation
  publication-title: arXiv:1706.05587
– ident: ref65
  doi: 10.1007/978-3-030-01249-6_18
– start-page: 5998
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref45
  article-title: Attention is all you need
– ident: ref124
  doi: 10.1109/ISSCC.2018.8310262
– ident: ref285
  doi: 10.4108/eai.30-11-2016.2267463
– ident: ref250
  doi: 10.1109/JSSC.2016.2616357
– ident: ref375
  doi: 10.1109/TCAD.2018.2858384
– ident: ref42
  doi: 10.1613/jair.301
– ident: ref252
  doi: 10.3390/en15207495
– ident: ref11
  doi: 10.1109/ACCESS.2018.2890150
– ident: ref79
  doi: 10.1109/CVPR.2016.435
– ident: ref223
  doi: 10.1109/CVPR42600.2020.01464
– ident: ref313
  doi: 10.1145/3174243.3174261
– ident: ref66
  doi: 10.1007/978-3-030-01246-5_2
– year: 2017
  ident: ref334
  article-title: Hello edge: Keyword spotting on microcontrollers
  publication-title: arXiv:1711.07128
– ident: ref119
  doi: 10.1109/VLSIC.2016.7573525
– year: 2020
  ident: ref46
  article-title: Language models are few-shot learners
  publication-title: arXiv:2005.14165
– start-page: 2888
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref204
  article-title: Moonshine: Distilling with cheap convolutions
– start-page: 916
  volume-title: Are very deep neural networks feasible on mobile devices?
  year: 2016
  ident: ref282
– ident: ref289
  doi: 10.1145/3296957.3173176
– ident: ref388
  doi: 10.1109/PERCOMW.2016.7457169
– year: 2022
  ident: ref71
  article-title: EdgeNeXt: Efficiently amalgamated CNN-transformer architecture for mobile vision applications
  publication-title: arXiv:2206.10589
– ident: ref122
  doi: 10.1109/EMC2.2018.00012
– ident: ref160
  doi: 10.1145/3020078.3021745
– ident: ref288
  doi: 10.1109/CVPRW.2011.5981829
– ident: ref203
  doi: 10.5244/C.28.88
– year: 2016
  ident: ref363
  article-title: Federated learning: Strategies for improving communication efficiency
  publication-title: arXiv:1610.05492
– ident: ref280
  doi: 10.1109/LES.2018.2815954
– start-page: 1273
  volume-title: Artificial Intelligence and Statistics
  year: 2017
  ident: ref359
  article-title: Communication-efficient learning of deep networks from decentralized data
– ident: ref346
  doi: 10.1109/MPRV.2017.2940968
– ident: ref52
  doi: 10.4324/9781410605337-29
– year: 2018
  ident: ref418
  article-title: ProxylessNAS: Direct neural architecture search on target task and hardware
  publication-title: arXiv:1812.00332
– ident: ref364
  doi: 10.1109/LCOMM.2019.2921755
– year: 2020
  ident: ref274
  article-title: CoCoPIE: Making mobile AI sweet as PIE—Compression-compilation co-design goes a long way
  publication-title: arXiv:2003.06700
– ident: ref188
  doi: 10.18653/v1/2020.findings-emnlp.372
– ident: ref110
  doi: 10.1145/3020078.3021740
– ident: ref109
  doi: 10.1016/j.micpro.2020.102991
– ident: ref58
  doi: 10.1109/CVPR.2018.00745
– ident: ref321
  doi: 10.23919/VLSIC.2017.8008534
– ident: ref397
  doi: 10.1145/2750858.2804262
– ident: ref128
  doi: 10.1109/FPL.2018.00035
– volume: 48
  start-page: 2849
  volume-title: Proc. 33rd Int. Conf. Int. Conf. Mach. Learn.
  ident: ref115
  article-title: Fixed point quantization of deep convolutional networks
– ident: ref167
  doi: 10.1109/JETCAS.2018.2835809
– ident: ref316
  doi: 10.1109/TNNLS.2018.2844093
– ident: ref179
  doi: 10.1145/3287624.3287714
– ident: ref238
  doi: 10.1109/MICRO.2018.00024
– volume-title: Deep Learning HDL Toolbox
  year: 2022
  ident: ref265
– ident: ref144
  doi: 10.1109/CVPR.2017.574
– ident: ref276
  doi: 10.1145/2964284.2973801
– ident: ref75
  doi: 10.1109/CVPR.2018.00716
– year: 2022
  ident: ref293
  article-title: Maximizing communication efficiency for large-scale training via 0/1 Adam
  publication-title: arXiv:2202.06009
– ident: ref10
  doi: 10.1109/JPROC.2020.2976475
– ident: ref374
  doi: 10.1109/PADSW.2018.8645013
– ident: ref56
  doi: 10.1109/CVPR.2017.195
– ident: ref417
  doi: 10.1145/3289602.3293910
– ident: ref414
  doi: 10.1109/MICRO.2016.7783720
– ident: ref28
  doi: 10.1109/INFOCOM.2018.8486403
– ident: ref404
  doi: 10.3390/electronics8111321
– ident: ref360
  doi: 10.2200/s00960ed2v01y201910aim043
– ident: ref242
  doi: 10.1109/MICRO.2018.00011
– ident: ref139
  doi: 10.1007/978-3-030-01267-0_44
– ident: ref385
  doi: 10.1007/978-3-319-97909-0_46
– ident: ref219
  doi: 10.1109/EMBC.2017.8036767
– start-page: 3882
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref88
  article-title: Phased LSTM: Accelerating recurrent network training for long or event-based sequences
– year: 2016
  ident: ref134
  article-title: Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or −1
  publication-title: arXiv:1602.02830
– ident: ref192
  doi: 10.1109/CVPR42600.2020.00396
– volume: 64
  start-page: 62
  issue: 6
  year: 2021
  ident: ref263
  article-title: CoCoPIE
  publication-title: Commun. ACM
  doi: 10.1145/3418297
– ident: ref85
  doi: 10.1109/ICASSP.2016.7472657
– ident: ref379
  doi: 10.1109/IPDPS53621.2022.00110
– ident: ref181
  doi: 10.1145/1150402.1150464
– ident: ref215
  doi: 10.1109/TII.2018.2842821
– ident: ref224
  doi: 10.48550/arXiv.1810.04805
– ident: ref317
  doi: 10.1109/FPL.2018.00075
– year: 2018
  ident: ref86
  article-title: The unreasonable effectiveness of the forget gate
  publication-title: arXiv:1804.04849
– ident: ref4
  doi: 10.1145/3234150
– ident: ref34
  doi: 10.1016/j.patcog.2015.03.009
– ident: ref391
  doi: 10.1145/3212725.3212728
– ident: ref354
  doi: 10.1109/ICPADS.2017.00069
– ident: ref358
  doi: 10.1109/CHASE.2017.115
– ident: ref218
  doi: 10.1109/ISSCC.2017.7870353
– ident: ref380
  doi: 10.1109/INFOCOM48880.2022.9796896
– ident: ref162
  doi: 10.1145/3081333.3081360
– ident: ref14
  doi: 10.1109/ICCVW.2019.00447
– ident: ref108
  doi: 10.1109/CVPR.2017.643
– ident: ref78
  doi: 10.1137/1.9781611970364
– ident: ref323
  doi: 10.1109/TVLSI.2018.2797600
– ident: ref402
  doi: 10.1145/3240765.3243473
– ident: ref76
  doi: 10.1109/CVPR.2018.00291
– ident: ref121
  doi: 10.1109/ICACSIS.2017.8355051
– ident: ref36
  doi: 10.1007/978-3-540-75171-7_2
– ident: ref96
  doi: 10.1145/3289602.3293898
– start-page: 3123
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref132
  article-title: BinaryConnect: Training deep neural networks with binary weights during propagations
– ident: ref315
  doi: 10.1109/TCAD.2017.2705069
– year: 2014
  ident: ref89
  article-title: Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition
  publication-title: arXiv:1402.1128
– ident: ref161
  doi: 10.1109/TMC.2020.2999956
– start-page: 601
  volume-title: Proc. Adv. Neural Inf. Process. Syst.
  ident: ref154
  article-title: HitNet: Hybrid ternary recurrent neural network
SSID ssj0003563
Score 2.7179554
SecondaryResourceType review_article
Snippet Successful integration of deep neural networks (DNNs) or deep learning (DL) has resulted in breakthroughs in many areas. However, deploying these highly...
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 1
SubjectTerms Algorithms
Algorithm–hardware codesign
Artificial intelligence
artificial intelligence (AI)
artificial intelligence on edge (edge-AI)
Artificial neural networks
Cloud computing
Co-design
Computer architecture
Cyber-physical systems
Data transmission
Deep learning
deep learning (DL)
Design optimization
Hardware
Image edge detection
Inference
Internet of Things
Iterative methods
Machine learning
Memory devices
model compression
Network latency
neural accelerator
Optimization
Optimization techniques
Real-time systems
State-of-the-art reviews
Training
Title Efficient Acceleration of Deep Learning Inference on Resource-Constrained Edge Devices: A Review
URI https://ieeexplore.ieee.org/document/9985008
https://www.proquest.com/docview/2763809889
Volume 111
WOSCitedRecordID wos000899996300001
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE/IET Electronic Library
  customDbUrl:
  eissn: 1558-2256
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0003563
  issn: 0018-9219
  databaseCode: RIE
  dateStart: 19630101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
doi 10.1109/JPROC.2022.3226481