Approximate Softmax Functions for Energy-Efficient Deep Neural Networks

Bibliographic details
Published in: IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 31, No. 1, pp. 1-13
Main authors: Chen, Ke; Gao, Yue; Waris, Haroon; Liu, Weiqiang; Lombardi, Fabrizio
Format: Journal Article
Language: English
Published: New York: IEEE, 1 January 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1063-8210, 1557-9999
Description
Abstract: Approximate computing has emerged as a new paradigm that provides power-efficient and high-performance arithmetic designs by relaxing the stringent requirement of accuracy. Nonlinear functions (such as softmax, rectified linear unit (ReLU), Tanh, and Sigmoid) are extensively used in deep neural networks (DNNs). However, they incur significant power dissipation due to their high circuit complexity. As DNNs are error-tolerant, the design of approximate nonlinear functions is possible and desirable. In this article, the design of an approximate softmax function (AxSF) is proposed. AxSF is based on a double hybrid structure (DHS): it splits the input of the softmax function into two parts that are processed by different methods. The most significant bits (MSBs) are processed with lookup tables (LUTs) and an exact restoring array divider (EXDr). A Taylor expansion and a logarithmic divider are used for the less significant bits (LSBs). An improved DHS (IDHS) is also proposed to reduce the hardware complexity. In IDHS, a novel Booth multiplier is utilized in the hybrid scheme to improve partial product generation and compression, while a truncated implementation is applied to the divider unit. The proposed DHS and IDHS are compared with existing softmax designs. The results show that the proposed approximate softmax design reduces hardware by 48% and delay by 54% while retaining high accuracy.
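The MSB/LSB split described in the abstract can be illustrated with a small software sketch. This is not the authors' circuit, only a minimal model of the general idea that e^(-(hi + lo)) = e^(-hi) * e^(-lo), where the e^(-hi) factor is read from a lookup table and the e^(-lo) factor is approximated by a first-order Taylor term (1 - lo). The bit widths (`FRAC_BITS`, `LSB_BITS`), the clamp range, and the function names are assumptions chosen for illustration, and the final division uses ordinary floating-point arithmetic instead of the restoring array and logarithmic dividers named in the abstract.

```python
import math

# Hypothetical fixed-point parameters (not taken from the article).
FRAC_BITS = 8                      # fractional bits of the quantized input
LSB_BITS = 4                       # low bits handled by the Taylor term
MAX_Q = (16 << FRAC_BITS) - 1      # clamp magnitudes to just under 16.0

# Lookup table for the MSB part: one entry per value of the upper bits.
# Each step of the upper bits is worth 2**-(FRAC_BITS - LSB_BITS) = 1/16.
MSB_STEP = 1 << (FRAC_BITS - LSB_BITS)
LUT = [math.exp(-i / MSB_STEP) for i in range((MAX_Q >> LSB_BITS) + 1)]


def approx_exp_neg(x):
    """Approximate e**x for x <= 0 using a LUT (MSBs) and 1 - lo (LSBs)."""
    # Quantize the magnitude of x to FRAC_BITS fractional bits and clamp it.
    q = min(int(round(-x * (1 << FRAC_BITS))), MAX_Q)
    hi = q >> LSB_BITS                  # upper bits: index into the LUT
    lo = q & ((1 << LSB_BITS) - 1)      # lower bits: small residual
    # First-order Taylor expansion: e**(-t) is roughly 1 - t for small t.
    return LUT[hi] * (1.0 - lo / (1 << FRAC_BITS))


def approx_softmax(logits):
    """Softmax with max subtraction and the approximate exponential above."""
    m = max(logits)
    exps = [approx_exp_neg(v - m) for v in logits]
    total = sum(exps)                   # at least 1.0, since the max maps to 1
    return [e / total for e in exps]


if __name__ == "__main__":
    print(approx_softmax([0.3, 1.7, -0.4, 2.2]))
```

With these assumed widths the residual `lo` is below 1/16, so the first-order term stays close to the true exponential. Widening `LSB_BITS` shrinks the table but increases the Taylor error.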
DOI: 10.1109/TVLSI.2022.3224011