Guarder: A Stable and Lightweight Reconfigurable RRAM-based PIM Accelerator for DNN IP Protection


Detailed Description

Bibliographic Details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main Authors: Lin, Ning; Li, Yi; Li, Jiankun; Yang, Jichang; He, Yangu; Luo, Yukui; Shang, Dashan; Chen, Xiaoming; Qi, Xiaojuan; Wang, Zhongrui
Format: Conference Paper
Language: English
Published: IEEE, 22.06.2025
Online Access: Full Text
Description
Abstract: Deploying deep neural networks (DNNs) on conventional digital edge devices faces significant challenges due to high energy consumption. A promising solution is the processing-in-memory (PIM) architecture with resistive random-access memory (RRAM), but RRAM-based systems suffer from imprecise weights due to programming stochasticity and cannot effectively utilize conventional weight encryption/decryption intellectual property (IP) protection schemes. To address these issues, we propose Guarder, a novel software-hardware co-design. On the hardware side, we introduce 3T2R cells to achieve reliable multiply-accumulate (MAC) operations and use reconfigurable inverter operating voltages to encode keys for encrypting DNNs on RRAM. On the software side, we implement a contrastive training method that ensures high model accuracy on authorized chips while degrading performance on unauthorized ones. This approach protects DNN IP with minimal hardware overhead while significantly mitigating the effects of RRAM programming stochasticity. Extensive experiments on tasks such as image classification (using MLP, ResNet, and ViT), segmentation (using SegFormer), and image generation (using DiT) validate the effectiveness of our method. The proposed contrastive training ensures negligible performance degradation on authorized chips, while performance on unauthorized chips drops to random guessing or generation. Compared to traditional RRAM accelerators, the 3T2R-based accelerator achieves a 1.41× reduction in area overhead and a 2.28× reduction in energy consumption.
DOI: 10.1109/DAC63849.2025.11133133
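The abstract's key-based protection idea can be illustrated with a toy sketch. This is NOT the paper's actual scheme (the real design encodes keys via reconfigurable inverter operating voltages in 3T2R cells); here a sign-flip mask stands in for the key, Gaussian noise stands in for RRAM programming stochasticity, and the `decrypt`/`mse` helpers and the margin-based contrastive objective are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "model": weights W, encrypted by element-wise sign flips.
# The sign-flip key is a hypothetical stand-in for Guarder's hardware key.
W = rng.normal(size=(4, 3))
key = rng.choice([-1.0, 1.0], size=W.shape)        # authorized key
wrong_key = rng.choice([-1.0, 1.0], size=W.shape)  # attacker's guess

W_enc = W * key  # "encrypted" weights as stored on the simulated RRAM array

def decrypt(w_enc, k, noise_std=0.05):
    """Decrypt with key k; Gaussian noise mimics programming stochasticity."""
    return w_enc * k + rng.normal(scale=noise_std, size=w_enc.shape)

x = rng.normal(size=(8, 4))
y = x @ W  # ground-truth outputs of the unprotected model

def mse(w):
    return float(np.mean((x @ w - y) ** 2))

loss_auth = mse(decrypt(W_enc, key))         # small: correct key recovers W
loss_unauth = mse(decrypt(W_enc, wrong_key))  # large: wrong signs corrupt W

# A contrastive-style training objective (illustrative form, not the
# paper's loss) would minimize the authorized loss while pushing the
# unauthorized loss above a margin:
margin = 1.0
total = loss_auth + max(0.0, margin - loss_unauth)
print(loss_auth < loss_unauth)
```

Under this toy setup, the authorized decryption incurs only the small stochastic-programming noise, while the wrong key flips roughly half the weight signs, so the unauthorized loss is far larger, which is the qualitative behavior the abstract reports (negligible degradation on authorized chips, near-random output on unauthorized ones).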