SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM
| Published in: | 2025 62nd ACM/IEEE Design Automation Conference (DAC) pp. 1 - 7 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 22.06.2025 |
| Summary: | Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all banks operate simultaneously under a single command. However, this synchronous approach requires all banks to be activated before computation begins, so activation time exceeds computation time and limits the performance gain. Recently, asynchronous execution PIM has been proposed as an alternative: banks operate asynchronously and overlap activation with processing to hide the row activation overhead. While this is effective at reducing row activation overhead, independent bank operation requires a large shared accumulator for each bank group, which increases area overhead. To address these issues, we propose bank group (BG)-level split-synchronization DRAM PIM, in which bank groups operate asynchronously with respect to one another to hide row activation overhead, while the banks within each bank group operate synchronously to eliminate the need for shared accumulators. Evaluation results show that the proposed design achieves average throughput improvements of 1.70× and 1.06× over conventional PIM and asynchronous execution PIM, respectively. Furthermore, the area overhead per processing unit (PU) increases by only 1.5% compared to conventional PIM and is significantly lower than that of asynchronous execution PIM. |
| DOI: | 10.1109/DAC63849.2025.11132821 |
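
The overlap argument in the summary can be illustrated with a small back-of-the-envelope timing model. This is only a sketch under stated assumptions, not the paper's evaluation model, and it does not attempt to reproduce the reported 1.70× or 1.06× figures. It assumes that row activations for different bank groups are serialized on a shared command bus, that a bank group cannot activate its next row while it is still computing, and that a bank group's activation can overlap with computation in other bank groups. The class name `ToyPimTiming`, its fields, and the numbers in the usage lines are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPimTiming:
    """Toy per-round timing model (hypothetical, arbitrary time units)."""

    bank_groups: int      # number of bank groups sharing the command bus
    act_per_group: float  # time to activate one row in every bank of a group
    compute: float        # time to process the activated rows in a group

    def fully_synchronous_round(self) -> float:
        # All banks must be activated before any computation starts,
        # so activation of every group is paid in full, then compute.
        return self.bank_groups * self.act_per_group + self.compute

    def split_sync_round(self) -> float:
        # Groups run asynchronously with respect to each other. In steady
        # state the round time is bounded by whichever is slower:
        #   - the shared bus issuing activations for every group, or
        #   - one group's own activate-then-compute cycle.
        bus_bound = self.bank_groups * self.act_per_group
        group_bound = self.act_per_group + self.compute
        return max(bus_bound, group_bound)

    def speedup(self) -> float:
        return self.fully_synchronous_round() / self.split_sync_round()


# Hypothetical parameters chosen only to exercise the model.
example = ToyPimTiming(bank_groups=4, act_per_group=2.0, compute=3.0)
print(example.fully_synchronous_round())  # 4 * 2.0 + 3.0
print(example.split_sync_round())         # max(4 * 2.0, 2.0 + 3.0)
print(example.speedup())
```

In this toy model the benefit comes entirely from hiding computation behind other groups' activations; area effects such as the shared accumulators mentioned in the summary are outside its scope.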