A Full-system, Programmable, and Extensible In-Memory Computing Simulation Framework for Deep Learning

Bibliographic Details
Published in:	2025 62nd ACM/IEEE Design Automation Conference (DAC) pp. 1 - 7
Main Authors:	Zhou, Kaining; Huang, Jian; Kim, Nam Sung; Shanbhag, Naresh
Format:	Conference Proceeding
Language:	English
Published:	IEEE 22.06.2025
Subjects:	Computational modeling; Computer architecture; Deep learning; Design tools; In-memory computing; Integrated circuit modeling; Signal to noise ratio; Software; Tensors; Transformers

Abstract	In-memory computing (IMC) has established itself as an attractive alternative to hardware accelerators in addressing the memory wall problem for artificial intelligence (AI) workloads. However, designing programmable IMC-based computing platforms for today's large generative AI models, such as large language models (LLMs) and diffusion transformers (DiTs), is hindered by the absence of a simulator that can address the associated scalability challenges while also incorporating the device- and circuit-level behaviors intrinsic to IMCs. To address this challenge, we present IMCsim, a versatile full-system IMC simulation framework. IMCsim integrates software runtime libraries for AI models, introduces a new set of ISA extensions to express common tensor operators, and provides flexibility in mapping these operators to various IMC architectures. As such, IMCsim enables designers to explore trade-offs between performance, energy, area, and computational accuracy for various IMC design choices. To demonstrate the functionality, efficiency, and versatility of IMCsim, we model three types of IMCs: (1) embedded non-volatile memory (eNVM)-based, (2) SRAM-based, and (3) digital IMCs. We validate IMCsim using measured data from two laboratory-tested IMC prototype ICs (a 22 nm MRAM-based IMC and a 28 nm SRAM-based IMC) and a digital IMC design in 28 nm. Next, we demonstrate the utility of IMCsim by exploring the architectural design space to obtain insights for maximizing utilization of IMC-based processors for diverse workloads (ResNet-18, Llama, and a DiT) using the three IMC types. Finally, we employ IMCsim as a design tool to obtain an efficient chip architecture and layout in 28 nm for a lightweight DiT.
Author affiliations	Zhou, Kaining (kainingz@illinois.edu); Huang, Jian (jianh@illinois.edu); Kim, Nam Sung (nskim@illinois.edu); Shanbhag, Naresh (shanbhag@illinois.edu); all at University of Illinois at Urbana-Champaign, IL, USA
DOI	10.1109/DAC63849.2025.11132463
EISBN	9798331503048
Funding	Defense Advanced Research Projects Agency (funder ID: 10.13039/100000185)
URI	https://ieeexplore.ieee.org/document/11132463