A 8-b-Precision 6T SRAM Computing-in-Memory Macro Using Segmented-Bitline Charge-Sharing Scheme for AI Edge Chips

Bibliographic details
Published in: IEEE journal of solid-state circuits Vol. 58; no. 3; pp. 877 - 892
Main authors: Su, Jian-Wei, Chou, Yen-Chi, Liu, Ruhui, Liu, Ta-Wei, Lu, Pei-Jung, Wu, Ping-Chun, Chung, Yen-Lin, Hong, Li-Yang, Ren, Jin-Sheng, Pan, Tianlong, Jhang, Chuan-Jia, Huang, Wei-Hsing, Chien, Chih-Han, Mei, Peng-I, Li, Sih-Han, Sheu, Shyh-Shyuan, Chang, Shih-Chieh, Lo, Wei-Chung, Wu, Chih-I, Si, Xin, Lo, Chung-Chuan, Liu, Ren-Shuo, Hsieh, Chih-Cheng, Tang, Kea-Tiong, Chang, Meng-Fan
Format: Journal Article
Language: English
Published: New York: IEEE, 1 March 2023
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN:0018-9200, 1558-173X
Description
Summary: Advances in static random access memory (SRAM)-CIM devices are meant to increase capacity while improving energy efficiency (EF) and reducing computing latency ($T_{\mathrm{AC}}$). This work presents a novel SRAM-CIM structure using: 1) a segmented-bitline charge-sharing (SBCS) scheme for multiply-and-accumulate (MAC) operations with low energy consumption and a consistently high signal margin across MAC values; 2) a bitline-combining (BL-CMB) scheme to reduce the number of analog-to-digital converters (ADCs) and, thereby, provide options in determining a tradeoff between EF and inference accuracy; 3) a source-injection local-multiplication cell (SILMC) connected to two types of global-bitline-switch to support the SBCS and BL-CMB schemes with consistent signal margin against process variation in transistors; and 4) prioritized-hybrid ADC to suppress area and power overhead for analog readout operations. We fabricated a 28-nm 384-kb SRAM-CIM macro using foundry-provided compact-6T cells supporting MAC operations with 16 accumulations of 8-b input and 8-b weight with near-full precision output (20 b). This macro achieved $T_{\mathrm{AC}}$ of 7.2 ns and EF of 22.75 TOPS/W performing 8-b-MAC operations.
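Note: the figures quoted in the summary can be sanity-checked with a short sketch. It is not taken from the article. It assumes unsigned operands (the record does not say whether inputs and weights are signed) and the common convention that 1 TOPS/W equals 10^12 operations per joule. The function names are hypothetical.

```python
def mac_output_bits(input_bits: int, weight_bits: int, accumulations: int) -> int:
    """Bits needed to hold the largest MAC sum without loss.

    Assumption: operands are unsigned, so the worst case is every
    input and every weight at its maximum value.
    """
    max_product = (2**input_bits - 1) * (2**weight_bits - 1)
    max_sum = accumulations * max_product
    return max_sum.bit_length()


def energy_per_op_fj(tops_per_watt: float) -> float:
    """Convert an efficiency in TOPS/W to femtojoules per operation.

    Assumption: 1 TOPS/W = 1e12 operations per joule; how an
    "operation" is counted is defined by the article, not here.
    """
    joules_per_op = 1.0 / (tops_per_watt * 1e12)
    return joules_per_op * 1e15


if __name__ == "__main__":
    # 16 accumulations of 8-b input x 8-b weight, as in the summary
    print(mac_output_bits(8, 8, 16))
    # Reported efficiency of 22.75 TOPS/W
    print(round(energy_per_op_fj(22.75), 2))
```

Under these assumptions, 16 accumulations of 8-b by 8-b products need 20 bits, which agrees with the 20-b output width stated in the summary, and 22.75 TOPS/W corresponds to roughly 44 fJ per operation.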
DOI:10.1109/JSSC.2022.3199077