An Efficient Bit-level Sparse MAC-accelerated Architecture with SW/HW Co-design on FPGA
Saved in:
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1 - 7 |
|---|---|
| Main authors: | , , , |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 22.06.2025 |
| Summary: | Exploring bit-level sparsity in the MAC process has been proven to be an important method for improving the efficiency of neural network feedforward processing. Reconfigurable platforms offer the possibility of identifying bit-level unstructured redundancy during inference across different DNN models. Significant progress has been made on value-aware accelerators for ASICs, yet few studies target FPGAs. This paper examines the limitations of implementing bit-level sparsity optimizations on FPGAs and proposes a software/architecture co-design solution. Specifically, by introducing a LUT-friendly encoding with adaptable granularity and a hardware structure that tolerates uncertainty in multiplication latency, we achieve a better trade-off between potential redundancy and accuracy while preserving compatibility and scalability. Experiments show that, under exact computation, PEs are up to 2.2× smaller than bit-parallel ones, and our design boosts performance by 1.04× to 1.74× over bit-parallel designs and by 1.40× to 2.79× over Booth-based designs. |
| DOI: | 10.1109/DAC63849.2025.11133249 |
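The summary above centers on skipping redundant bits during multiply-accumulate. As a rough, hypothetical sketch of that general idea (not the paper's actual encoding or PE architecture), the Python snippet below counts the work done by a bit-serial MAC that processes only the set bits of each weight, in contrast to a bit-parallel multiplier whose latency is fixed by the word length; the function name, word length, and data are assumptions for illustration only.

```python
# Minimal illustration of bit-level sparsity in a MAC (hypothetical, not the paper's design).
# A bit-serial MAC iterates only over the set bits of each weight, so sparse bit patterns
# need fewer add-and-shift steps than the fixed word length of a bit-parallel multiplier.

def bit_serial_mac(activations, weights, weight_bits=8):
    """Accumulate sum(a * w) by adding 'a' shifted once per set bit of 'w' (unsigned weights)."""
    acc = 0
    steps = 0  # add-and-shift operations actually performed
    for a, w in zip(activations, weights):
        for b in range(weight_bits):
            if (w >> b) & 1:       # zero bits are skipped entirely
                acc += a << b      # add the activation shifted to the bit position
                steps += 1
    return acc, steps

acts = [3, 7, 1, 5]
wts = [0b00000001, 0b00010000, 0b00000000, 0b10000010]  # mostly-zero bit patterns

result, work = bit_serial_mac(acts, wts)
assert result == sum(a * w for a, w in zip(acts, wts))  # exact, no approximation
print(f"dot product = {result}, add-shift steps = {work}; "
      f"a bit-parallel design spends {len(acts)} full 8-bit multiplies regardless of sparsity")
```

Because the inner loop length depends on each weight's bit pattern, per-multiplication latency varies from operand to operand, which is the kind of timing uncertainty the summary says the hardware structure has to accommodate.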