A Full-system, Programmable, and Extensible In-Memory Computing Simulation Framework for Deep Learning
In-memory computing (IMC) has established itself as an attractive alternative to hardware accelerators in addressing the memory wall problem for artificial intelligence (AI) workloads. However, designing programmable IMC-based computing platforms for today's large generative AI models, such as...
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7 |
|---|---|
| Main authors: | Kaining Zhou, Jian Huang, Nam Sung Kim, Naresh Shanbhag |
| Format: | Conference proceeding |
| Language: | English |
| Published: | IEEE, 22 June 2025 |
| Abstract | In-memory computing (IMC) has established itself as an attractive alternative to hardware accelerators in addressing the memory wall problem for artificial intelligence (AI) workloads. However, designing programmable IMC-based computing platforms for today's large generative AI models, such as large language models (LLMs) and diffusion transformers (DiTs), is hindered by the absence of a simulator that is able to address the associated scalability challenges while simultaneously incorporating device and circuit-level behaviors intrinsic to IMCs. To address this challenge, we present IMCsim, a versatile full-system IMC simulation framework. IMCsim integrates software runtime libraries for AI models, introduces a new set of ISA extensions to express common tensor operators, and provides flexibility in mapping these operators to various IMC architectures. As such, IMCsim enables designers to explore trade-offs between performance, energy, area, and computational accuracy for various IMC design choices. To demonstrate the functionality, efficiency, and versatility of IMCsim, we model three types of IMCs: (1) embedded non-volatile memory (eNVM)-based, (2) SRAM-based, and (3) digital IMCs. We validate IMCsim using measured data from two laboratory-tested IMC prototype ICs (a 22 nm MRAM-based IMC and a 28 nm SRAM-based IMC) and a digital IMC design in 28 nm. Next, we demonstrate the utility of IMCsim by exploring the architectural design space to obtain insights for maximizing utilization of IMC-based processors for diverse workloads (ResNet-18, Llama, and a DiT) using the three IMC types. Finally, we employ IMCsim as a design tool to obtain an efficient chip architecture and layout in 28 nm for a lightweight DiT. |
|---|---|
| Author | Shanbhag, Naresh; Huang, Jian; Kim, Nam Sung; Zhou, Kaining |
| Author_xml | 1. Zhou, Kaining (kainingz@illinois.edu), University of Illinois at Urbana-Champaign, IL, USA; 2. Huang, Jian (jianh@illinois.edu), University of Illinois at Urbana-Champaign, IL, USA; 3. Kim, Nam Sung (nskim@illinois.edu), University of Illinois at Urbana-Champaign, IL, USA; 4. Shanbhag, Naresh (shanbhag@illinois.edu), University of Illinois at Urbana-Champaign, IL, USA |
| ContentType | Conference Proceeding |
| DOI | 10.1109/DAC63849.2025.11132463 |
| EISBN | 9798331503048 |
| EndPage | 7 |
| ExternalDocumentID | 11132463 |
| Genre | orig-research |
| GrantInformation_xml | Funder: Defense Advanced Research Projects Agency (funder ID: 10.13039/100000185) |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 7 |
| PublicationCentury | 2000 |
| PublicationDate | 2025-June-22 |
| PublicationDateYYYYMMDD | 2025-06-22 |
| PublicationDate_xml | – month: 06 year: 2025 text: 2025-June-22 day: 22 |
| PublicationDecade | 2020 |
| PublicationTitle | 2025 62nd ACM/IEEE Design Automation Conference (DAC) |
| PublicationTitleAbbrev | DAC |
| PublicationYear | 2025 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 1 |
| SubjectTerms | Computational modeling; Computer architecture; Deep learning; Design tools; In-memory computing; Integrated circuit modeling; Signal to noise ratio; Software; Tensors; Transformers |
| Title | A Full-system, Programmable, and Extensible In-Memory Computing Simulation Framework for Deep Learning |
| URI | https://ieeexplore.ieee.org/document/11132463 |