Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization

Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study reveals the inherent fault tolerance of neural...

Bibliographic details
Published in: IEEE Transactions on Computers, vol. 73, no. 8, pp. 1953-1966
Main authors: Wang, Jing; Zhu, Jinbin; Fu, Xin; Zang, Di; Li, Keyao; Zhang, Weigong
Format: Journal Article
Language: English
Published: New York: IEEE, 1 August 2024
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 0018-9340, 1557-9956
Online access: Full text
Abstract Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although supplementary fault-tolerant mechanisms can improve the reliability of DNNs, they may introduce an efficiency tradeoff. By thoroughly exploring the structural attributes of DNNs, this study reveals the inherent fault tolerance of neural networks, in which individual neurons exhibit varying degrees of fault tolerance. We thereby develop a hardware/software collaborative method that guarantees the reliability of DNNs while minimizing performance degradation. We introduce the neuron vulnerability factor (NVF) to quantify the susceptibility of neurons to soft errors. We propose two efficient methods that leverage the NVF to minimize the negative effects of soft errors on neurons. First, we present a novel computational scheduling scheme that prioritizes error-prone neurons so that their computations complete sooner, mitigating the risk of neural computing errors caused by soft errors without sacrificing efficiency. Second, we propose an NVF-guided heterogeneous memory system that employs variable-strength error-correcting codes, with the error-correction strength tailored to the vulnerability profile of specific neurons, for highly targeted error mitigation. Our experimental results demonstrate that the proposed scheme enhances neural network accuracy by 18% on average while significantly reducing the fault-tolerance overhead.
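The two NVF-guided ideas named in the abstract (computing the most vulnerable neurons first, and giving more vulnerable neurons stronger error correction) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the NVF is computed, so the sketch takes per-neuron NVF values as given inputs in [0, 1]. The names `Neuron`, `schedule_by_vulnerability`, `assign_ecc_tier`, and `plan_protection`, the three tier labels, and the thresholds 0.3 and 0.7 are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Neuron:
    name: str
    nvf: float  # assumed: neuron vulnerability factor, normalized to [0, 1]


def schedule_by_vulnerability(neurons: List[Neuron]) -> List[str]:
    """Return neuron names ordered so the most vulnerable are computed first."""
    # sorted() is stable even with reverse=True, so neurons with equal NVF
    # keep their original relative order.
    ranked = sorted(neurons, key=lambda n: n.nvf, reverse=True)
    return [n.name for n in ranked]


def assign_ecc_tier(nvf: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map an NVF value to a protection tier (thresholds are illustrative)."""
    if nvf >= high:
        return "strong"   # e.g. a code that corrects multiple bit errors
    if nvf >= low:
        return "medium"   # e.g. single-error correction
    return "light"        # e.g. parity-style detection only


def plan_protection(neurons: List[Neuron]) -> Dict[str, str]:
    """Assign every neuron a protection tier based on its NVF."""
    return {n.name: assign_ecc_tier(n.nvf) for n in neurons}


# Hypothetical NVF values, for demonstration only.
neurons = [
    Neuron("n0", 0.10),
    Neuron("n1", 0.85),
    Neuron("n2", 0.50),
    Neuron("n3", 0.85),
]
order = schedule_by_vulnerability(neurons)
plan = plan_protection(neurons)
print(order)  # ['n1', 'n3', 'n2', 'n0']
print(plan)   # {'n0': 'light', 'n1': 'strong', 'n2': 'medium', 'n3': 'strong'}
```

In this sketch the schedule and the protection plan both derive from the same NVF ranking, which mirrors the abstract's framing of one vulnerability metric guiding both the software-side ordering and the hardware-side memory protection. A real design would also have to account for data dependencies between layers and the storage cost of each code.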
Author Zhang, Weigong
Zang, Di
Zhu, Jinbin
Wang, Jing
Fu, Xin
Li, Keyao
Author_xml – sequence: 1
  givenname: Jing
  orcidid: 0000-0003-3653-7013
  surname: Wang
  fullname: Wang, Jing
  email: jwang@ruc.edu.cn
  organization: School of Information, Renmin University of China, Beijing, China
– sequence: 2
  givenname: Jinbin
  orcidid: 0000-0003-4955-7718
  surname: Zhu
  fullname: Zhu, Jinbin
  email: zhujinbin@sdust.edu.cn
  organization: College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao, China
– sequence: 3
  givenname: Xin
  orcidid: 0000-0002-9458-4769
  surname: Fu
  fullname: Fu, Xin
  email: xfu8@central.uh.edu
  organization: Electrical and Computer Engineering Department, University of Houston, Houston, TX, USA
– sequence: 4
  givenname: Di
  surname: Zang
  fullname: Zang, Di
  email: zd@cnu.edu.cn
  organization: College of Information Engineering, Capital Normal University, Beijing, China
– sequence: 5
  givenname: Keyao
  surname: Li
  fullname: Li, Keyao
  email: lky@cnu.edu.cn
  organization: College of Information Engineering, Capital Normal University, Beijing, China
– sequence: 6
  givenname: Weigong
  orcidid: 0000-0003-3969-5607
  surname: Zhang
  fullname: Zhang, Weigong
  email: zwg771@cnu.edu.cn
  organization: College of Information Engineering, Capital Normal University, Beijing, China
CODEN ITCOB4
CitedBy_id crossref_primary_10_3390_electronics14051042
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2024
DOI 10.1109/TC.2024.3398492
DatabaseName IEEE All-Society Periodicals Package (ASPP) 2005–Present
IEEE All-Society Periodicals Package (ASPP) 1998–Present
IEEE/IET Electronic Library (IEL) (UW System Shared)
CrossRef
Computer and Information Systems Abstracts
Electronics & Communications Abstracts
Technology Research Database
ProQuest Computer Science Collection
Advanced Technologies Database with Aerospace
Computer and Information Systems Abstracts – Academic
Computer and Information Systems Abstracts Professional
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library (IEL) (UW System Shared)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
Computer Science
EISSN 1557-9956
EndPage 1966
ExternalDocumentID 10_1109_TC_2024_3398492
10527392
Genre orig-research
GrantInformation_xml – fundername: National Natural Science Foundation of China (NSFC)
  grantid: 62076168
  funderid: 10.13039/501100001809
IEDL.DBID RIE
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001270596400003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN 0018-9340
IsPeerReviewed true
IsScholarly true
Issue 8
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
LinkModel DirectLink
ORCID 0000-0003-4955-7718
0000-0003-3969-5607
0000-0003-3653-7013
0000-0002-9458-4769
PQID 3078087151
PQPubID 85452
PageCount 14
ParticipantIDs proquest_journals_3078087151
ieee_primary_10527392
crossref_citationtrail_10_1109_TC_2024_3398492
crossref_primary_10_1109_TC_2024_3398492
PublicationCentury 2000
PublicationDate 2024-08-01
PublicationDateYYYYMMDD 2024-08-01
PublicationDate_xml – month: 08
  year: 2024
  text: 2024-08-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on computers
PublicationTitleAbbrev TC
PublicationYear 2024
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
SourceID proquest
crossref
ieee
SourceType Aggregation Database
Enrichment Source
Index Database
Publisher
StartPage 1953
SubjectTerms Artificial neural networks
Collaboration
Computer network reliability
Computing time
Error correcting codes
Error correction
Fault tolerance
Fault tolerant systems
Hardware
memory protection
Network reliability
neural network
Neural networks
neuron vulnerability factor
Neurons
Performance degradation
Reliability
Safety critical
soft error
Soft errors
Software
Software reliability
Structural reliability
Title Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization
URI https://ieeexplore.ieee.org/document/10527392
https://www.proquest.com/docview/3078087151
Volume 73
WOSCitedRecordID wos001270596400003&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE/IET Electronic Library (IEL) (UW System Shared)
  customDbUrl:
  eissn: 1557-9956
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0006209
  issn: 0018-9340
  databaseCode: RIE
  dateStart: 19680101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
linkProvider IEEE