A Full-system, Programmable, and Extensible In-Memory Computing Simulation Framework for Deep Learning
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC) pp. 1 - 7 |
|---|---|
| Main Authors: | , , , |
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 22.06.2025 |
| Summary: | In-memory computing (IMC) has established itself as an attractive alternative to hardware accelerators in addressing the memory wall problem for artificial intelligence (AI) workloads. However, designing programmable IMC-based computing platforms for today's large generative AI models, such as large language models (LLMs) and diffusion transformers (DiTs), is hindered by the absence of a simulator that can address the associated scalability challenges while also incorporating the device- and circuit-level behaviors intrinsic to IMCs. To address this challenge, we present IMCsim, a versatile full-system IMC simulation framework. IMCsim integrates software runtime libraries for AI models, introduces a new set of ISA extensions to express common tensor operators, and provides flexibility in mapping these operators to various IMC architectures. As such, IMCsim enables designers to explore trade-offs between performance, energy, area, and computational accuracy for various IMC design choices. To demonstrate the functionality, efficiency, and versatility of IMCsim, we model three types of IMCs: (1) embedded non-volatile memory (eNVM)-based, (2) SRAM-based, and (3) digital IMCs. We validate IMCsim using measured data from two laboratory-tested IMC prototype ICs (a 22 nm MRAM-based IMC and a 28 nm SRAM-based IMC) and a digital IMC design in 28 nm. Next, we demonstrate the utility of IMCsim by exploring the architectural design space to obtain insights for maximizing utilization of IMC-based processors for diverse workloads (ResNet-18, Llama, and a DiT) using the three IMC types. Finally, we employ IMCsim as a design tool to obtain an efficient chip architecture and layout in 28 nm for a lightweight DiT. |
| DOI: | 10.1109/DAC63849.2025.11132463 |