Soft Error Tolerant Convolutional Neural Networks on FPGAs With Ensemble Learning
Convolutional neural networks (CNNs) are widely used in computer vision and natural language processing. Field-programmable gate arrays (FPGAs) are popular accelerators for CNNs. However, if used in critical applications, the reliability of FPGA-based CNNs becomes a priority because FPGAs are prone...
Saved in:
| Published in: | IEEE transactions on very large scale integration (VLSI) systems Vol. 30; no. 3; pp. 291 - 302 |
|---|---|
| Main Authors: | , , , , , , , , |
| Format: | Journal Article |
| Language: | English |
| Published: |
New York
IEEE
01.03.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Subjects: | |
| ISSN: | 1063-8210, 1557-9999 |
| Online Access: | Get full text |
| Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
| Summary: | Convolutional neural networks (CNNs) are widely used in computer vision and natural language processing. Field-programmable gate arrays (FPGAs) are popular accelerators for CNNs. However, if used in critical applications, the reliability of FPGA-based CNNs becomes a priority because FPGAs are prone to suffer soft errors. Traditional protection schemes, such as triple modular redundancy (TMR), introduce a large overhead, which is not acceptable in resource-limited platforms. This article proposes to use an ensemble of weak CNNs to build a robust classifier with low cost. To have a group of base CNNs with low complexity and balanced similarity and diversity, residual neural networks (ResNets) with different layers (20/32/44/56) are combined in the ensemble system to replace a single strong ResNet 110. In addition, a robust combiner is designed based on the reliability evaluation of a single ResNet. Single ResNets with different layers and different ensemble schemes are implemented on the FPGA accelerator based on Xilinx Zynq 7000 SoC. The reliability of the ensemble systems is evaluated based on a large-scale fault injection platform and compared with that of the TMR-protected ResNet 110 and ResNet 20. Experiment results show that the proposed ensembles could effectively improve the system reliability when suffering soft errors with an overhead much lower than TMR. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 1063-8210 1557-9999 |
| DOI: | 10.1109/TVLSI.2021.3138491 |