An Empirical Study on Neural Networks Pruning: Trimming for Reducing Memory or Workload
| Published in: | 2023 5th International Conference on Data-driven Optimization of Complex Systems (DOCS), pp. 1-7 |
|---|---|
| Main Authors: | Xiao, Kaiwen; Cai, Xiaodong; Xu, Ke |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 22.09.2023 |
| Online Access: | Get full text |
| Abstract | Most existing studies on neural network pruning consider only memory-based pruning strategies. However, pruning for computational workload is often more important in hardware deployments, where the focus is on reducing model computation. In addition, most pruning schemes restore model accuracy during pruning at the cost of extra hyperparameters, longer training time, and greater training complexity. This work proposes a statistics-based, globally soft, iterative pruning scheme. With little extra computation, an extremely sparse model can be obtained without additional hyperparameters or extended training time. Moreover, this work introduces the concept of computational intensity to balance model memory and computational workload during pruning. Focusing on memory-oriented pruning, we achieve 303×, 100×, and 25× parameter compression on the LeNet-5 (MNIST), VGG (CIFAR-10), and AlexNet (ImageNet) models, respectively. In particular, combined with cluster quantization, the LeNet-5 model parameters can be compressed by 3232×. Focusing on workload-oriented pruning, we reduce computation on the AlexNet model by 7.6× without accuracy loss, significantly more than prior work. In addition, to verify the versatility of the pruning method, we also migrate it to object detection, achieving 10× parameter compression and 2.8× computation compression on YOLOv2 with an mAP reduction within 1%. |
|---|---|
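The record describes the pruning scheme only at a high level. As an illustration, one globally soft, magnitude-based pruning step driven by a statistical threshold could look like the minimal NumPy sketch below; the threshold rule `mean + k·std` over all weight magnitudes and the function names are assumptions for this sketch, not the paper's exact method:

```python
import numpy as np

def global_threshold(layers, k=1.0):
    """Global statistical threshold over all layers: mean + k*std of |w|.
    (Illustrative rule; the paper's exact statistic is not given in this record.)"""
    mags = np.concatenate([np.abs(w).ravel() for w in layers])
    return mags.mean() + k * mags.std()

def soft_prune_step(layers, k=1.0):
    """One 'soft' pruning iteration: weights below the global threshold are
    zeroed but not frozen, so masked weights may regrow during the next
    training pass; repeating this gives an iterative, hyperparameter-light
    path to high sparsity."""
    t = global_threshold(layers, k)
    masks = [np.abs(w) >= t for w in layers]
    pruned = [w * m for w, m in zip(layers, masks)]
    return pruned, masks

# Hypothetical usage on two random "layers":
rng = np.random.default_rng(0)
layers = [rng.standard_normal((8, 8)), rng.standard_normal((16, 4))]
pruned, masks = soft_prune_step(layers, k=0.5)
```

Because the threshold is computed globally rather than per layer, layers with many small weights are pruned harder, which is what makes the scheme "globally" soft rather than layer-wise.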
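Computational intensity is only named in the abstract. One plausible formalization, an assumption here and not necessarily the authors' definition, is the number of multiply-accumulates each parameter carries: high for convolutional layers, where every weight is reused across the output map, and 1 for fully connected layers:

```python
def conv_flops(c_in, c_out, k, h_out, w_out):
    # Multiply-accumulates for a k x k convolution (bias ignored):
    # each of the c_out * h_out * w_out outputs needs c_in * k * k MACs.
    return c_out * h_out * w_out * c_in * k * k

def conv_params(c_in, c_out, k):
    # Weight count of the same convolution (bias ignored).
    return c_out * c_in * k * k

def computational_intensity(c_in, c_out, k, h_out, w_out):
    # FLOPs carried per parameter; reduces to h_out * w_out, i.e. the
    # number of times each weight is reused across the output map.
    return conv_flops(c_in, c_out, k, h_out, w_out) / conv_params(c_in, c_out, k)
```

Under this reading, pruning high-intensity (convolutional) layers cuts workload fastest, while pruning low-intensity (fully connected) layers cuts memory fastest, which is the trade-off the abstract says computational intensity is meant to balance.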
| Author | Xiao, Kaiwen; Cai, Xiaodong; Xu, Ke (School of Artificial Intelligence, Anhui University, Hefei, China) |
| DOI | 10.1109/DOCS60977.2023.10294956 |
| EISBN | 9798350393521 |
| GrantInformation | National Natural Science Foundation of China, grant 62206003 (funder id: 10.13039/501100001809) |
| SubjectTerms | Computational modeling; Focusing; Image coding; Neural networks; Object detection; Quantization (signal); Training |
| URI | https://ieeexplore.ieee.org/document/10294956 |