Search results - Theory of computation → Shared memory algorithms
1
Verifying Lock-Free Search Structure Templates (Artifact)
ISSN: 2509-8195. Published: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 12.09.2024. “… We present and verify template algorithms for lock-free concurrent search structures that cover a broad range of existing implementations based on lists and skiplists …”
Full text
Dataset
2
pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures
Published: IEEE, 29.06.2024. Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024). “… Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level memory bandwidth …”
Full text
Conference proceeding
3
Max-PIM: Fast and Efficient Max/Min Searching in DRAM
Published: IEEE, 05.12.2021. Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021). “… In this work, for the first time, we propose a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for minimum and maximum …”
Full text
Conference proceeding
4
SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM
Published: IEEE, 22.06.2025. Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025). “… Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all banks operate simultaneously under a single command …”
Full text
Conference proceeding
5
PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System
Published: IEEE, 22.06.2025. Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025). “… To overcome these challenges, we explore UPMEM's DPU, a processing-in-memory (PIM) technology that reduces data movement by performing computations directly within memory …”
Full text
Conference proceeding
6
UPVSS: Jointly Managing Vector Similarity Search with Near-Memory Processing Systems
Published: IEEE, 22.06.2025. Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025). “… to performance limitations in conventional von Neumann architecture with shared memory buses. This data movement bottleneck restricts the efficiency and scalability of vector similarity search due to insufficient memory bandwidth …”
Full text
Conference proceeding
7
AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory
Published: IEEE, 22.06.2025. Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025). “… While recent heterogeneous architectures attempt to address the memory-bound bottleneck from attention computations by processing-in-memory (PIM …”
Full text
Conference proceeding
8
BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory
Published: IEEE, 05.12.2021. Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021). “… Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge …”
Full text
Conference proceeding
9
SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link
Published: IEEE, 21.10.2023. Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023). “… Disaggregated memory has been gaining significant traction as a promising solution for scaling memory capacity and better utilizing memory resources in data centers …”
Full text
Conference proceeding
10
Hardware Support for Durable Atomic Instructions for Persistent Parallel Programming
Published: IEEE, 09.07.2023. Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023). “… Persistent memory is emerging as an attractive main memory fabric capable of hosting persistent data …”
Full text
Conference proceeding
11
Optimizing Persistent Memory Transactions
ISSN: 2641-7936. Published: IEEE, 01.09.2019. Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019). “… Byte-addressable, non-volatile, random access memory (NVM) has the potential to dramatically accelerate the performance of storage-intensive workloads …”
Full text
Conference proceeding
12
Separating Mechanism from Policy in STM
Published: IEEE, 21.10.2023. Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023). “… On one hand, Software Transactional Memory (STM) is easy, because it allows programmers to simply mark regions of sequential code as requiring atomicity, and then the compiler ensures that no races manifest …”
Full text
Conference proceeding
13
DARIC: A Data Reuse-Friendly CGRA for Parallel Data Access via Elastic FIFOs
Published: IEEE, 09.07.2023. Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023). “… For parallel data accesses, uniform memory partitioning is usually introduced to CGRA for better pipelining performance …”
Full text
Conference proceeding
14
Asynchronous Distributed-Memory Parallel Algorithms for Influence Maximization
Published: IEEE, 17.11.2024. Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024). “… We propose distributed-memory parallel algorithms for the two main kernels of a state-of-the-art implementation of one IM algorithm, influence maximization via martingales (IMM …”
Full text
Conference proceeding
15
Ultrafast CPU/GPU Kernels for Density Accumulation in Placement
Published: IEEE, 05.12.2021. Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021). “… Density accumulation is a widely-used primitive operation in physical design, especially for placement. Iterative invocation in the optimization flow makes it …”
Full text
Conference proceeding
16
Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management
Published: ACM, 01.09.2016. Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016). “… Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality …”
Full text
Conference proceeding
17
Unfair Scheduling Patterns in NUMA Architectures
ISSN: 2641-7936. Published: IEEE, 01.09.2019. Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019). “… Lock-free algorithms are typically designed and analyzed with adversarial scheduling in mind …”
Full text
Conference proceeding
18
CAF: Core to core Communication Acceleration Framework
Published: ACM, 01.09.2016. Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016). “… The traditional way cores communicate is by using shared memory space between them. However, shared memory communication fundamentally involves coherence invalidations …”
Full text
Conference proceeding
19
Enumeration of Billions of Maximal Bicliques in Bipartite Graphs without Using GPUs
Published: IEEE, 17.11.2024. Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024). “… To overcome this limitation, we propose an AdaMBE algorithm. First, we redesign its core operations using local neighborhood information derived from computational subgraphs to minimize redundant memory accesses …”
Full text
Conference proceeding
20
Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods
ISSN: 2167-4337. Published: ACM, 11.11.2023. Published in: International Conference for High Performance Computing, Networking, Storage and Analysis (Online) (11.11.2023). “… As data volume grows rapidly, high-performance parallel top-K algorithms become critical …”
Full text
Conference proceeding