Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization

Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE transactions on computers Ročník 73; číslo 8; s. 1953 - 1966
Hlavní autoři: Wang, Jing, Zhu, Jinbin, Fu, Xin, Zang, Di, Li, Keyao, Zhang, Weigong
Médium: Journal Article
Jazyk:angličtina
Vydáno: New York IEEE 01.08.2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:0018-9340, 1557-9956
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural networks, where individual neurons exhibit varying degrees of fault tolerance, by thoroughly exploring the structural attributes of DNNs. We thereby develop a hardware/software collaborative method that guarantees the reliability of DNNs while minimizing performance degradation. We introduce the neuron vulnerability factor (NVF) to quantify the susceptibility to soft errors. We propose two efficient methods that leverage the NVF to minimize the negative effects of soft errors on neurons. First, we present a novel computational scheduling scheme. By prioritizing error-prone neurons, the expedited completion of their computations is facilitated to mitigate the risk of neural computing errors that arise from soft errors without sacrificing efficiency. Second, we propose the NVF-guided heterogeneous memory system. We employ variable-strength error-correcting codes and tailor their error-correction mechanisms to the vulnerability profile of specific neurons to ensure a highly targeted approach for error mitigation. Our experimental results demonstrate that the proposed scheme enhances the neural network accuracy by 18% on average, while significantly reducing the fault-tolerance overhead.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:0018-9340
1557-9956
DOI:10.1109/TC.2024.3398492