MASR: A Modular Accelerator for Sparse RNNs


Detailed bibliography
Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 1–14
Main authors: Gupta, Udit; Reagen, Brandon; Pentecost, Lillian; Donato, Marco; Tambe, Thierry; Rush, Alexander M.; Wei, Gu-Yeon; Brooks, David
Format: Conference paper
Language: English
Publication details: IEEE, 01.09.2019
ISSN: 2641-7936
Online access: Get full text
Abstract Recurrent neural networks (RNNs) are becoming the de-facto solution for speech recognition. RNNs exploit long-term temporal relationships in data by applying repeated, learned transformations. Unlike fully-connected (FC) layers with single vector-matrix operations, RNN layers consist of hundreds of such operations chained over time. This poses challenges unique to RNNs that are not found in convolutional neural networks (CNNs) or FC models, namely large dynamic activations. In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip ASR. MASR is designed to exploit sparsity in both dynamic activations and static weights. The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel accelerator designs. In comparison to current state-of-the-art sparse neural network accelerators (e.g., EIE), MASR provides 2× area, 3× energy, and 1.6× performance benefits. The modular nature of MASR enables designs that efficiently scale from resource-constrained low-power IoT applications to large-scale, highly parallel datacenter deployments.
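The abstract's central idea — skipping "null operations" so no energy is spent multiplying by zero — can be illustrated in software. The sketch below is a toy analogy, not the MASR hardware itself: it performs a multiply-accumulate only when both the weight (static sparsity, from pruning) and the activation (dynamic sparsity, e.g. produced by ReLU) are nonzero. The function name and list-based representation are illustrative choices, not anything from the paper.

```python
def sparse_matvec(weights, activations):
    """Zero-skipping vector-matrix product (software analogy of the
    accelerator's null-operation skipping).

    weights:     list of rows (list of floats), possibly pruned (zeros)
    activations: list of floats, possibly sparse (zeros from ReLU)
    """
    # Find nonzero activations once per timestep; in hardware this
    # corresponds to compactly encoding the dynamic activation vector.
    nz_act = [j for j, a in enumerate(activations) if a != 0.0]
    out = []
    for row in weights:
        acc = 0.0
        for j in nz_act:
            if row[j] != 0.0:  # skip pruned (static-zero) weights too
                acc += row[j] * activations[j]  # the only MACs performed
        out.append(acc)
    return out
```

With high weight and activation sparsity, the inner loop touches only a small fraction of the dense matrix's entries, which is the source of the energy and performance benefits the abstract claims.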
Affiliation: Harvard University (all authors)
DOI: 10.1109/PACT.2019.00009
EISBN: 9781728136134, 172813613X
EISSN: 2641-7936
ISI cited references count: 43
Publication title abbreviation: PACT
Subject terms: Acceleration; Accelerator; automatic speech recognition; Computer architecture; deep neural network; Encoding; Hardware; Hidden Markov models; Load management; Recurrent neural networks; sparsity
URL: https://ieeexplore.ieee.org/document/8891617