A Fully Probabilistic Model for Sigmoid Approximation and Its Hardware-Efficient Implementation

Bibliographic Details
Published in: IEEE Transactions on Circuits and Systems I: Regular Papers, Vol. 71, No. 8, pp. 3775-3786
Main Authors: Lu, Wenhao; Lu, Minshan; Zhang, Xiangfen; Lu, Zhongzhiguang; Sun, Miao; Dong, Boyi; Shu, Zhou
Format: Journal Article
Language: English
Published: New York: IEEE, 01.08.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1549-8328, 1558-0806
Description
Summary: The sigmoid function is a representative activation function in shallow neural networks. Its hardware realization is challenging because of the complex exponential and reciprocal operations involved. Existing studies apply piecewise models to approximate the sigmoid function and employ numerical methods or non-uniform input segmentation to mitigate fitting inaccuracies. However, the breakpoints inevitably introduce a loss of approximation precision, and the additional fitting processes greatly increase hardware complexity and power consumption. This paper presents a hardware-friendly sigmoid approximation from the perspective of probability theory. We find that, for a given input, the output of the sigmoid function can be approximated by the probability that the sum of this input and a Gaussian random variable is greater than or equal to zero. Because the derived theorem does not involve piecewise expressions, the precision loss caused by breakpoints is avoided. A low-complexity binary-search-based address localization method is proposed to optimize our theorem for hardware implementation, and an efficient circuit implementing the optimized scheme is also presented. Our scheme's approximation ability and hardware efficiency are validated through software modeling and FPGA- and ASIC-based experiments. Feedforward neural network-based classification applications demonstrate that building networks with the proposed sigmoid approximator incurs only a tiny loss in recognition rate.
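The probabilistic statement in the summary can be illustrated with a short numerical sketch. This record does not give the paper's Gaussian parameters or the details of its address localization method, so everything below is an assumption made for illustration: the standard deviation 1.702 is the well-known constant for matching a Gaussian CDF to the logistic function, and the names `prob_sigmoid`, `build_table`, and `lookup_sigmoid` are hypothetical. The table lookup only shows how a binary search over sorted Gaussian values can return the probability as an address; it is not claimed to be the authors' circuit.

```python
import math
from bisect import bisect_left
from statistics import NormalDist

# Assumed standard deviation; the catalog record does not state the paper's value.
SIGMA = 1.702


def sigmoid(x):
    """Reference logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def prob_sigmoid(x, sigma=SIGMA):
    """P(x + G >= 0) for G ~ N(0, sigma^2), which equals Phi(x / sigma)."""
    return NormalDist(0.0, sigma).cdf(x)


def build_table(n=256, sigma=SIGMA):
    """Hypothetical lookup table: n sorted, equally probable Gaussian values."""
    dist = NormalDist(0.0, sigma)
    return [dist.inv_cdf((k + 0.5) / n) for k in range(n)]


def lookup_sigmoid(x, table):
    """Fraction of table entries g with x + g >= 0, found by binary search."""
    idx = bisect_left(table, -x)  # index of the first entry >= -x
    return (len(table) - idx) / len(table)


if __name__ == "__main__":
    table = build_table()
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(x, sigmoid(x), prob_sigmoid(x), lookup_sigmoid(x, table))
```

In this sketch the binary search needs about log2(n) comparisons per input, and the returned index directly encodes the output probability, which is one plausible reason a search-based scheme suits hardware.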
DOI: 10.1109/TCSI.2024.3413570