Memristor-Based Circuit Implementation and Circuitry Optimized Algorithm for Mamba Language Network

Bibliographic Details
Published in: IEEE transactions on circuits and systems. I, Regular papers, pp. 1-14
Main Authors: Zhang, Junming; Sheng, Zheyuan; Sun, Huajun; Zhu, Chuanbo; Chen, Liangyu; Hu, Zhenyu; Miao, Xiangshui
Format: Journal Article
Language: English
Published: IEEE 2025
ISSN:1549-8328, 1558-0806
Online Access: Full text
Description
Summary: Language networks are crucial in artificial intelligence, with the novel Mamba architecture significantly reducing computations and consumption compared to the traditional transformer network. However, a full-circuit implementation of the Mamba network has not been proposed due to the complexity of computations and data storage. Additionally, optimized hardware-aware parallel algorithms for Mamba inference in circuits remain undeveloped. This work addresses these challenges by presenting a memristor-based full-circuit implementation of the Mamba network and introducing a computing-in-memory parallel-aware algorithm tailored for circuit-level inference. The implementation includes: 1) Standard 1T1M memristor crossbar and depthwise separable convolution memristor crossbar for different convolutions. 2) Computing-in-memory implicit latent state circuits for the computation and transition of latent states. 3) Functional circuits for SiLU activation, RMS normalization, and multi-layer multiply-accumulate operations. 4) Optimized algorithm and circuit implementation for hardware-aware inference, achieving parallel scanning and hardware awareness in circuits. The proposed circuit enables analog signal computations and eliminates redundant analog-to-digital conversions and intermediate storage. A basic single-sentence generation task was simulated in PSPICE, validating the circuit's correctness. Analyses of analog computation accuracy, circuit stability, and power consumption demonstrate the proposed circuit's advantages, highlighting its potential as a fundamental module for large-scale circuit integration and complex text generation tasks.
DOI:10.1109/TCSI.2025.3584247