MARS: Multi-macro Architecture SRAM CIM-Based Accelerator with Co-designed Compressed Neural Networks

Published in: arXiv.org
Authors: Syuan-Hao Sie, Jye-Luen Lee, Yi-Ren Chen, Chih-Cheng Lu, Chih-Cheng Hsieh, Meng-Fan Chang, Kea-Tiong Tang
Medium: Paper
Language: English
Publication details: Ithaca: Cornell University Library, arXiv.org, 25 May 2021
ISSN: 2331-8422
Abstract: Convolutional neural networks (CNNs) play a key role in deep learning applications. However, the large storage overhead and substantial computation cost of CNNs are problematic in hardware accelerators. Computing-in-memory (CIM) architectures have demonstrated great potential for effectively computing large-scale matrix-vector multiplications. However, the intensive multiply-and-accumulate (MAC) operations executed at the crossbar array and the limited capacity of CIM macros remain bottlenecks for further improvement of energy efficiency and throughput. To reduce computation costs, network pruning and quantization are two widely studied compression methods that shrink the model size. However, most model compression algorithms can only be implemented in digital-based CNN accelerators. For implementation in a static random access memory (SRAM) CIM-based accelerator, the model compression algorithm must consider the hardware limitations of CIM macros, such as the number of word lines and bit lines that can be turned on at the same time, as well as how weights are mapped to the SRAM CIM macro. In this study, a software and hardware co-design approach is proposed to design an SRAM CIM-based CNN accelerator and an SRAM CIM-aware model compression algorithm. To lessen the high-precision MAC operations required by batch normalization (BN), a quantization algorithm that fuses BN into the weights is proposed. Furthermore, to reduce the number of network parameters, a sparsity algorithm that considers the CIM architecture is proposed. Lastly, MARS, a CIM-based CNN accelerator that can utilize multiple SRAM CIM macros as processing units and supports sparse neural networks, is proposed.
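The BN-fusion step mentioned in the abstract follows the standard batch-norm folding identity: since BN applies a per-channel affine transform, it can be absorbed into the preceding layer's weights and bias before quantization, removing the high-precision MAC at inference time. Below is a minimal NumPy sketch of that identity for a linear layer; the function name `fold_bn` and the shapes are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def fold_bn(W, b, gamma, beta, mu, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding layer.

    BN computes  y = gamma * (Wx + b - mu) / sqrt(var + eps) + beta,
    which equals  y = W_f x + b_f  with the per-channel scale below.
    """
    s = gamma / np.sqrt(var + eps)   # per-output-channel scale
    W_f = W * s[:, None]             # scale each output row of W
    b_f = (b - mu) * s + beta        # fold mean/shift into the bias
    return W_f, b_f

# Check: the folded layer matches linear-then-BN on a random input.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 8)); b = rng.standard_normal(4)
gamma = rng.standard_normal(4); beta = rng.standard_normal(4)
mu = rng.standard_normal(4); var = rng.random(4) + 0.1
x = rng.standard_normal(8)

y_bn = gamma * (W @ x + b - mu) / np.sqrt(var + 1e-5) + beta
W_f, b_f = fold_bn(W, b, gamma, beta, mu, var)
assert np.allclose(W_f @ x + b_f, y_bn)
```

After folding, only `W_f` and `b_f` need to be quantized and mapped onto the CIM macro; no separate BN arithmetic remains in the inference path.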
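The abstract's "sparsity algorithm that considers the CIM architecture" points at structured pruning: because a CIM macro activates whole word-line groups at once, zeros only save energy if an entire group is zero. The sketch below illustrates that general idea with group-wise magnitude pruning; the function `group_prune`, the L2 scoring, and the group size are assumptions for illustration, not the paper's actual algorithm.

```python
import numpy as np

def group_prune(W, group, keep_ratio):
    """Structured-pruning sketch: score contiguous row groups (standing in
    for word-line groups of a CIM macro) by L2 norm and zero the weakest,
    so whole groups of word lines can be gated off at inference."""
    rows, cols = W.shape
    assert rows % group == 0, "rows must align to the macro group size"
    G = W.reshape(rows // group, group, cols)
    scores = np.linalg.norm(G, axis=(1, 2))        # one score per group
    n_keep = max(1, int(round(keep_ratio * len(scores))))
    keep = np.argsort(scores)[::-1][:n_keep]       # strongest groups
    mask = np.zeros(len(scores), dtype=bool)
    mask[keep] = True
    return (G * mask[:, None, None]).reshape(rows, cols)

W = np.arange(32, dtype=float).reshape(8, 4)
Wp = group_prune(W, group=4, keep_ratio=0.5)
# rows 4-7 have the larger norm, so the group of rows 0-3 is zeroed
```

Unstructured (element-wise) sparsity would leave partially filled word-line groups that the macro still has to activate; aligning the pruning granularity to the macro's group size is what makes the zeros exploitable in hardware.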
Copyright: 2021. This work is published under http://arxiv.org/licenses/nonexclusive-distrib/1.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.48550/arxiv.2010.12861
Discipline: Physics
Genre: Working Paper/Pre-Print
Subject terms: Accelerators; Algorithms; Artificial neural networks; Co-design; Computer architecture; Hardware; Machine learning; Macros; Matrix algebra; Matrix methods; Measurement; Multiplication; Neural networks; Random access memory; Sparsity; Static random access memory
Record URL: https://www.proquest.com/docview/2454519067