An Empirical Study on Neural Networks Pruning: Trimming for Reducing Memory or Workload

Bibliographic Details
Published in: 2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1-7
Main Authors: Xiao, Kaiwen, Cai, Xiaodong, Xu, Ke
Format: Conference Proceeding
Language: English
Published: IEEE, 22.09.2023
Subjects: Computational modeling; Focusing; Image coding; Neural networks; Object detection; Quantization (signal); Training
Online Access: https://ieeexplore.ieee.org/document/10294956
Abstract: Most existing studies on neural network pruning consider only memory-based pruning strategies. However, pruning for computational workload is often more important in hardware deployments, where the primary goal is to reduce model computation. In addition, most pruning schemes restore model accuracy during pruning at the cost of added hyperparameters, longer training, and greater training complexity. This work proposes a statistics-based, globally soft iterative pruning scheme: with little extra computation, it yields an extremely sparse model without additional hyperparameters or extended training time. This work also introduces the concept of computational intensity to balance model memory and computational workload during pruning. With memory-oriented pruning, we achieve 303×, 100×, and 25× parameter compression on the LeNet-5 (MNIST), VGG (CIFAR-10), and AlexNet (ImageNet) models, respectively; combined with cluster quantization, the LeNet-5 parameters can be compressed by 3232×. With workload-oriented pruning, we reduce computation on AlexNet by 7.6× without accuracy loss, significantly more than prior work. Finally, to verify the versatility of the method, we apply it to object detection, achieving 10× parameter compression and 2.8× computation compression on YOLOv2 with an mAP loss of under 1%.
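The record carries no code, and the abstract describes the scheme only at a high level. As a reading aid, here is a minimal PyTorch sketch of globally soft iterative pruning under stated assumptions: the paper's exact statistical criterion is not reproduced in this record, so a plain global magnitude threshold stands in for it, and the function name and linear sparsity ramp are hypothetical.

```python
import torch
import torch.nn as nn

def global_soft_iterative_prune(model: nn.Module, final_sparsity: float, steps: int):
    """Sketch of globally soft iterative pruning (assumed magnitude criterion)."""
    prunable = [m for m in model.modules() if isinstance(m, (nn.Conv2d, nn.Linear))]
    for step in range(1, steps + 1):
        # Ramp the target sparsity up gradually across iterations (assumed schedule).
        target = final_sparsity * step / steps
        all_w = torch.cat([m.weight.detach().abs().flatten() for m in prunable])
        k = int(target * all_w.numel())
        if k == 0:
            continue
        # One global threshold across all layers, rather than per-layer budgets.
        threshold = torch.kthvalue(all_w, k).values
        for m in prunable:
            # "Soft": masks are recomputed from scratch each iteration, so a
            # weight zeroed in one step may recover in a later one.
            m.weight.data.mul_(m.weight.detach().abs() > threshold)
        # ...a short fine-tuning pass would normally follow each step...
```

Rebuilding the mask every iteration, instead of freezing it, is what lets prematurely pruned weights recover during fine-tuning without introducing extra hyperparameters.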
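The abstract also introduces "computational intensity" to balance memory and workload during pruning, but does not define it in this record. A plausible stand-in, sketched below, is a layer's FLOPs-to-parameter ratio; the formula and function name are assumptions, not the paper's definition.

```python
import torch.nn as nn

def computational_intensity(layer: nn.Module, out_h: int, out_w: int) -> float:
    """FLOPs-to-parameter ratio (hypothetical stand-in for the paper's metric)."""
    params = layer.weight.numel()
    if isinstance(layer, nn.Conv2d):
        flops = params * out_h * out_w  # each conv weight fires once per output pixel
    else:
        flops = params                  # each Linear weight fires exactly once
    return flops / params
```

Under this reading, convolutional layers (intensity roughly equal to the output resolution) dominate workload while fully connected layers (intensity 1) dominate memory, which is consistent with the AlexNet numbers quoted above: 25× parameter compression under memory-oriented pruning versus 7.6× computation reduction under workload-oriented pruning.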
Author Affiliations: Kaiwen Xiao, Xiaodong Cai, and Ke Xu (School of Artificial Intelligence, Anhui University, Hefei, China)
DOI: 10.1109/DOCS60977.2023.10294956
EISBN: 9798350393521
Funding: National Natural Science Foundation of China, grant 62206003 (funder ID: 10.13039/501100001809)