Search results - shared-memory algorithms
-
1
Max-PIM: Fast and Efficient Max/Min Searching in DRAM
Published: IEEE, 05.12.2021
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“… In this work, for the first time, we propose a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for minimum and maximum…”
Conference Paper -
2
PATSMA: Parameter Auto-tuning for Shared Memory Algorithms
ISSN: 2352-7110, 2352-7110
Published: Elsevier B.V, 01.09.2024
Published in: SoftwareX (01.09.2024)
“…Programs with high levels of complexity often face challenges in adjusting execution parameters, particularly when the ideal value for these parameters may…”
Journal Article -
3
SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM
Published: IEEE, 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all…”
Conference Paper -
4
pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures
Published: IEEE, 29.06.2024
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level…”
Conference Paper -
5
PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System
Published: IEEE, 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Data deduplication enhances storage efficiency through non-destructive compression but is often hindered by the chunking process, which requires scanning the…”
Conference Paper -
6
UPVSS: Jointly Managing Vector Similarity Search with Near-Memory Processing Systems
Published: IEEE, 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“… to performance limitations in conventional von Neumann architecture with shared memory buses. This data movement bottleneck restricts the efficiency and scalability of vector similarity search due to insufficient memory bandwidth…”
Conference Paper -
7
AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory
Published: IEEE, 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Large Language Models (LLMs) have demonstrated unprecedented generative performance across a wide range of applications. While recent heterogeneous…”
Conference Paper -
8
Parallelization of particle-mass-transfer algorithms on shared-memory, multi-core CPUs
ISSN: 0309-1708
Published: Elsevier Ltd, 01.11.2024
Published in: Advances in water resources (01.11.2024)
“…Simulating the transfer of mass between particles is not straightforwardly parallelized because it involves the calculation of the influence of many particles…”
Journal Article -
9
DARIC: A Data Reuse-Friendly CGRA for Parallel Data Access via Elastic FIFOs
Published: IEEE, 09.07.2023
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“… Based on the resource graph of DARIC, a mapping algorithm supporting path sharing is proposed…”
Conference Paper -
10
Separating Mechanism from Policy in STM
Published: IEEE, 21.10.2023
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023)
“…When designing concurrent data structures (CDSs), it can feel like programmers must choose between performance and convenience. On one hand, Software…”
Conference Paper -
11
SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link
Published: IEEE, 21.10.2023
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023)
“…Disaggregated memory has been gaining significant traction as a promising solution for scaling memory capacity and better utilizing memory resources in data…”
Conference Paper -
12
Modularity‐based parallel protein design algorithm with an implementation using shared memory programming
ISSN: 0887-3585, 1097-0134, 1097-0134
Published: Hoboken, USA, John Wiley & Sons, Inc, 01.03.2022
Published in: Proteins, structure, function, and bioinformatics (01.03.2022)
“…–based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU…”
Journal Article -
13
Hardware Support for Durable Atomic Instructions for Persistent Parallel Programming
Published: IEEE, 09.07.2023
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Persistent memory is emerging as an attractive main memory fabric capable of hosting persistent data. However, its programmability is hampered by the lack of…”
Conference Paper -
14
Mutable locks: Combining the best of spin and sleep locks
ISSN: 1532-0626, 1532-0634
Published: Hoboken, Wiley Subscription Services, Inc, 25.11.2020
Published in: Concurrency and computation (25.11.2020)
“…Summary In this article, we present mutable locks, a synchronization construct with the same semantic of traditional locks (such as spin locks or sleep locks),…”
Journal Article -
15
BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory
Published: IEEE, 05.12.2021
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge…”
Conference Paper -
16
Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management
Published: ACM, 01.09.2016
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016)
“…Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality…”
Conference Paper -
17
Ultrafast CPU/GPU Kernels for Density Accumulation in Placement
Published: IEEE, 05.12.2021
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…Density accumulation is a widely-used primitive operation in physical design, especially for placement. Iterative invocation in the optimization flow makes it…”
Conference Paper -
18
An enhanced parallel block coordinate descent algorithm with shared memory for solving large-scale user equilibrium problems
ISSN: 1366-5545
Published: Elsevier Ltd, 01.12.2025
Published in: Transportation research. Part E, Logistics and transportation review (01.12.2025)
“…OpenMP-based parallelization of the PBCD algorithm to leverage shared memory parallelism efficiently…”
Journal Article -
19
CAF: Core to core Communication Acceleration Framework
Published: ACM, 01.09.2016
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016)
“… The traditional way cores communicate is by using shared memory space between them. However, shared memory communication fundamentally involves coherence invalidations…”
Conference Paper -
20
Optimizing Persistent Memory Transactions
ISSN: 2641-7936
Published: IEEE, 01.09.2019
Published in: Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019)
“…Byte-addressable, non-volatile, random access memory (NVM) has the potential to dramatically accelerate the performance of storage-intensive workloads. For…”
Conference Paper