An 8b-Precision 6T SRAM Computing-in-Memory Macro Using Time-Domain Incremental Accumulation for AI Edge Chips

Bibliographic Details
Published in: IEEE Journal of Solid-State Circuits, Vol. 59, no. 7, pp. 2297-2309
Main Authors: Wu, Ping-Chun; Su, Jian-Wei; Chung, Yen-Lin; Hong, Li-Yang; Ren, Jin-Sheng; Chang, Fu-Chun; Wu, Yuan; Chen, Ho-Yu; Lin, Chen-Hsun; Hsiao, Hsu-Ming; Li, Sih-Han; Sheu, Shyh-Shyuan; Chang, Shih-Chieh; Lo, Wei-Chung; Wu, Chih-I; Lo, Chung-Chuan; Liu, Ren-Shuo; Hsieh, Chih-Cheng; Tang, Kea-Tiong; Chang, Meng-Fan
Format: Journal Article
Language: English
Published: New York: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2024
ISSN: 0018-9200, 1558-173X
Description
Summary: This article presents a novel static random access memory computing-in-memory (SRAM-CIM) structure designed for high-precision multiply-and-accumulate (MAC) operations with high energy efficiency (EF), high readout accuracy, and short compute latency. The proposed device employs 1) a time-domain incremental-accumulation (TDIA) scheme to enable high-accumulation MAC operations while maintaining a large signal margin across MAC values (MACVs), 2) a dynamic differential-reference (D2REF) scheme based on software-hardware co-design to reduce read energy consumption, and 3) a low-dMACV-aware recursive time-to-digital converter (LMAR-TDC) for implementation with the D2REF scheme to further suppress readout energy consumption. A 28 nm 1 Mb SRAM-CIM macro fabricated using foundry-provided compact 6T-SRAM cells achieved an EF of 39.31 TOPS/W and a compute latency of 6.6 ns for 8b-MAC operations with 64 accumulations per cycle and near-full output precision (22b).
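The 22b output figure quoted in the summary is consistent with a simple bit-growth bound for accumulating 64 products of 8b operands. The sketch below is illustrative arithmetic only and is not taken from the article: the function names are hypothetical, and the operand signedness is an assumption, so both the unsigned and the two's-complement signed cases are shown.

```python
def unsigned_mac_bits(in_bits: int, w_bits: int, n_acc: int) -> int:
    """Bits needed to hold the largest sum of n_acc unsigned products."""
    max_sum = (2**in_bits - 1) * (2**w_bits - 1) * n_acc
    return max_sum.bit_length()


def signed_mac_bits(in_bits: int, w_bits: int, n_acc: int) -> int:
    """Two's-complement bits needed for the sum of n_acc signed products."""
    lo_in, hi_in = -(2 ** (in_bits - 1)), 2 ** (in_bits - 1) - 1
    lo_w, hi_w = -(2 ** (w_bits - 1)), 2 ** (w_bits - 1) - 1
    # Extreme products occur at the corners of the operand ranges.
    corners = [lo_in * lo_w, lo_in * hi_w, hi_in * lo_w, hi_in * hi_w]
    max_sum = max(corners) * n_acc
    min_sum = min(corners) * n_acc
    n = 1
    # Grow the word width until both extremes fit in two's complement.
    while not (-(2 ** (n - 1)) <= min_sum and max_sum <= 2 ** (n - 1) - 1):
        n += 1
    return n


if __name__ == "__main__":
    # 8b inputs, 8b weights, 64 accumulations (values from the summary).
    print(unsigned_mac_bits(8, 8, 64))
    print(signed_mac_bits(8, 8, 64))
```

Under either assumption the bound for 8b inputs, 8b weights, and 64 accumulations is 22 bits, roughly 16 bits per product plus log2(64) = 6 bits of accumulation growth.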
DOI: 10.1109/JSSC.2023.3343669