Flex-SFU: Accelerating DNN Activation Functions by Non-Uniform Piecewise Approximation
| Published in: | 2023 60th ACM/IEEE Design Automation Conference (DAC), pp. 1-6 |
|---|---|
| Main authors: | Reggiani, Enrico; Andri, Renzo; Cavigelli, Lukas |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 09.07.2023 |
| Abstract | Modern DNN workloads increasingly rely on activation functions consisting of computationally complex operations. This poses a challenge to current accelerators optimized for convolutions and matrix-matrix multiplications. This work presents Flex-SFU, a lightweight hardware accelerator for activation functions implementing non-uniform piecewise interpolation supporting multiple data formats. Non-uniform segments and floating-point numbers are enabled by implementing a binary-tree comparison within the address decoding unit. An SGD-based optimization algorithm with heuristics is proposed to find the interpolation function reducing the mean squared error. Thanks to non-uniform interpolation and floating-point support, Flex-SFU achieves on average 22.3x better mean squared error compared to previous piecewise linear interpolation approaches. The evaluation with more than 700 computer vision and natural language processing models shows that Flex-SFU can, on average, improve the end-to-end performance of state-of-the-art AI hardware accelerators by 35.7%, achieving up to 3.3x speedup with negligible impact on the models' accuracy when using 32 segments, and only introducing an area and power overhead of 5.9% and 0.8% relative to the baseline vector processing unit. |
|---|---|
| Author | Reggiani, Enrico; Cavigelli, Lukas; Andri, Renzo |
| Author_xml | 1. Reggiani, Enrico (Huawei Zurich Research Center, Computing Systems Lab, Switzerland); 2. Andri, Renzo (renzo.andri@huawei.com; Huawei Zurich Research Center, Computing Systems Lab, Switzerland); 3. Cavigelli, Lukas (Huawei Zurich Research Center, Computing Systems Lab, Switzerland) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/DAC56929.2023.10247855 |
| EISBN | 9798350323481 |
| EndPage | 6 |
| ExternalDocumentID | 10247855 |
| Genre | orig-research |
| ISICitedReferencesCount | 4 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-July-9 |
| PublicationDateYYYYMMDD | 2023-07-09 |
| PublicationDate_xml | – month: 07 year: 2023 text: 2023-July-9 day: 09 |
| PublicationDecade | 2020 |
| PublicationTitle | 2023 60th ACM/IEEE Design Automation Conference (DAC) |
| PublicationTitleAbbrev | DAC |
| PublicationYear | 2023 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1 |
| SubjectTerms | Computational modeling; Computer vision; Decoding; Design automation; Heuristic algorithms; Interpolation; Natural language processing |
| Title | Flex-SFU: Accelerating DNN Activation Functions by Non-Uniform Piecewise Approximation |
| URI | https://ieeexplore.ieee.org/document/10247855 |
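
The abstract describes approximating activation functions with non-uniform piecewise interpolation, where the segment is selected by a binary-tree comparison. The following is a minimal software sketch of that general idea only, not the hardware design or the SGD-based optimizer from the paper. It assumes GELU as the target function, linear interpolation between breakpoints, and hand-picked breakpoints (`NONUNIFORM`, `UNIFORM`); all function and variable names are hypothetical and chosen for illustration.

```python
import bisect
import math


def gelu(x: float) -> float:
    """Exact GELU, used here as the function to approximate (an assumption)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def build_table(func, breakpoints):
    """Precompute (slope, intercept) for each segment between sorted breakpoints."""
    table = []
    for lo, hi in zip(breakpoints, breakpoints[1:]):
        slope = (func(hi) - func(lo)) / (hi - lo)
        table.append((slope, func(lo) - slope * lo))
    return table


def pwl_eval(x, breakpoints, table):
    """Evaluate the piecewise-linear approximation at x."""
    # Binary search over the breakpoints: O(log n) comparisons, the software
    # analogue of a binary-tree comparison for segment selection.
    idx = bisect.bisect_right(breakpoints, x) - 1
    # Clamp so inputs outside the range reuse the outermost segments.
    idx = min(max(idx, 0), len(table) - 1)
    slope, intercept = table[idx]
    return slope * x + intercept


def mse(func, breakpoints, table, lo=-6.0, hi=6.0, n=2001):
    """Mean squared error of the approximation on an evenly spaced grid."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return sum((func(x) - pwl_eval(x, breakpoints, table)) ** 2 for x in xs) / n


# Hypothetical hand-picked breakpoints, denser near zero where GELU curves most.
# These are NOT the SGD-optimized breakpoints described in the abstract.
NONUNIFORM = [-6.0, -3.0, -2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0, 3.0, 6.0]
# Same number of segments (10), evenly spaced, for comparison.
UNIFORM = [-6.0 + 12.0 * i / 10 for i in range(11)]

nonuniform_table = build_table(gelu, NONUNIFORM)
uniform_table = build_table(gelu, UNIFORM)

if __name__ == "__main__":
    print("non-uniform MSE:", mse(gelu, NONUNIFORM, nonuniform_table))
    print("uniform MSE:    ", mse(gelu, UNIFORM, uniform_table))
```

Both tables use the same number of segments, so any difference in error on this toy setup comes only from where the breakpoints are placed. The figures reported in the abstract come from the authors' own evaluation and are not reproduced by this sketch.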