MASR: A Modular Accelerator for Sparse RNNs
Saved in:
| Published in: | Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 1–14 |
|---|---|
| Main authors: | Gupta, Udit; Reagen, Brandon; Pentecost, Lillian; Donato, Marco; Tambe, Thierry; Rush, Alexander M.; Wei, Gu-Yeon; Brooks, David |
| Format: | Conference paper |
| Language: | English |
| Publisher: | IEEE, 01.09.2019 |
| ISSN: | 2641-7936 |
| Online access: | Get full text |
| Abstract | Recurrent neural networks (RNNs) are becoming the de-facto solution for speech recognition. RNNs exploit long-term temporal relationships in data by applying repeated, learned transformations. Unlike fully-connected (FC) layers with single vector-matrix operations, RNN layers consist of hundreds of such operations chained over time. This poses challenges unique to RNNs that are not found in convolutional neural networks (CNNs) or FC models, namely large dynamic activations. In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip ASR. MASR is designed to exploit sparsity in both dynamic activations and static weights. The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel accelerator designs. In comparison to current state-of-the-art sparse neural network accelerators (e.g., EIE), MASR provides 2× area, 3× energy, and 1.6× performance benefits. The modular nature of MASR enables designs that efficiently scale from resource-constrained low-power IoT applications to large-scale, highly parallel datacenter deployments. |
|---|---|
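The abstract's central idea, skipping work on both zero (pruned) weights and zero dynamic activations so that no energy is spent on null operations, can be illustrated with a minimal software sketch. This is not MASR's actual microarchitecture; the CSR encoding and function names below are illustrative assumptions.

```python
import numpy as np

def csr_from_dense(w):
    """Compress a pruned weight matrix into CSR form (static weight sparsity)."""
    vals, cols, rowptr = [], [], [0]
    for row in w:
        for j, v in enumerate(row):
            if v != 0.0:
                vals.append(v)
                cols.append(j)
        rowptr.append(len(vals))
    return np.array(vals), np.array(cols), np.array(rowptr)

def sparse_matvec(vals, cols, rowptr, x):
    """y = W @ x, issuing a MAC only when both weight and activation are nonzero."""
    nz = x != 0.0  # dynamic activation sparsity, e.g. after a ReLU
    y = np.zeros(len(rowptr) - 1)
    for i in range(len(y)):
        for k in range(rowptr[i], rowptr[i + 1]):
            j = cols[k]
            if nz[j]:  # skip the null operation entirely
                y[i] += vals[k] * x[j]
    return y

# Toy example: a pruned 2x3 weight matrix and an activation vector with a zero.
W = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 3.0]])
x = np.array([4.0, 0.0, 5.0])
y = sparse_matvec(*csr_from_dense(W), x)  # matches the dense product W @ x
```

In hardware, the per-element `if` becomes a scheduling problem: with many MAC units in parallel, naively skipping nulls leaves lanes idle, which is why the abstract emphasizes optimizations that keep MAC utilization high despite irregular sparsity.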
| Author | Gupta, Udit; Reagen, Brandon; Pentecost, Lillian; Donato, Marco; Tambe, Thierry; Rush, Alexander M.; Wei, Gu-Yeon; Brooks, David (all Harvard University) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/PACT.2019.00009 |
| Discipline | Computer Science |
| EISBN | 172813613X 9781728136134 |
| EISSN | 2641-7936 |
| EndPage | 14 |
| ISICitedReferencesCount | 43 |
| IsPeerReviewed | false |
| IsScholarly | true |
| PageCount | 14 |
| PublicationDate | 2019-Sept. |
| PublicationTitle | Proceedings / International Conference on Parallel Architectures and Compilation Techniques |
| PublicationTitleAbbrev | PACT |
| PublicationYear | 2019 |
| Publisher | IEEE |
| StartPage | 1 |
| SubjectTerms | Acceleration; Accelerator; automatic speech recognition; Computer architecture; deep neural network; Encoding; Hardware; Hidden Markov models; Load management; Recurrent neural networks; sparsity |
| Title | MASR: A Modular Accelerator for Sparse RNNs |
| URI | https://ieeexplore.ieee.org/document/8891617 |