Search Results - shared‐memory algorithms

  1.

    Max-PIM: Fast and Efficient Max/Min Searching in DRAM by Zhang, Fan, Angizi, Shaahin, Fan, Deliang

    Published: IEEE 05.12.2021
    “… In this work, for the first time, we propose a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for minimum and maximum…”
    Conference Proceeding
  2.

    PATSMA: Parameter Auto-tuning for Shared Memory Algorithms by Fernandes, Joao B., Santos-da-Silva, Felipe H., Barros, Tiago, Assis, Italo A.S., Xavier-de-Souza, Samuel

    ISSN: 2352-7110
    Published: Elsevier B.V 01.09.2024
    Published in SoftwareX (01.09.2024)
    “…Programs with high levels of complexity often face challenges in adjusting execution parameters, particularly when the ideal value for these parameters may…”
    Journal Article
  3.

    SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM by Yoon, Byungkuk, Han, Sanghyeok, Park, Gyeonghwan, Kim, Jae-Joon

    Published: IEEE 22.06.2025
    “…Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all…”
    Conference Proceeding
  4.

    pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures by Baek, Daehyeon, Hwang, Soojin, Huh, Jaehyuk

    Published: IEEE 29.06.2024
    “…Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level…”
    Conference Proceeding
  5.

    PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System by Yeh, Chun-Le, Chen, Liang-Chi, Ho, Chien-Chung, Chang, Yu-Ming, Chang, Da-Wei

    Published: IEEE 22.06.2025
    “…Data deduplication enhances storage efficiency through non-destructive compression but is often hindered by the chunking process, which requires scanning the…”
    Conference Proceeding
  6.

    UPVSS: Jointly Managing Vector Similarity Search with Near-Memory Processing Systems by Liu, Chun-Chien, Wu, Chun-Feng, Jin, Yunho

    Published: IEEE 22.06.2025
    “… to performance limitations in conventional von Neumann architecture with shared memory buses. This data movement bottleneck restricts the efficiency and scalability of vector similarity search due to insufficient memory bandwidth…”
    Conference Proceeding
  7.

    AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory by Chen, Liyan, Lyu, Dongxu, Li, Zhenyu, Jiang, Jianfei, Wang, Qin, Mao, Zhigang, Jing, Naifeng

    Published: IEEE 22.06.2025
    “…Large Language Models (LLMs) have demonstrated unprecedented generative performance across a wide range of applications. While recent heterogeneous…”
    Conference Proceeding
  8.

    Parallelization of particle-mass-transfer algorithms on shared-memory, multi-core CPUs by Benson, David A., Pribec, Ivan, Engdahl, Nicholas B., Pankavich, Stephen, Schauer, Lucas

    ISSN: 0309-1708
    Published: Elsevier Ltd 01.11.2024
    Published in Advances in water resources (01.11.2024)
    “…Simulating the transfer of mass between particles is not straightforwardly parallelized because it involves the calculation of the influence of many particles…”
    Journal Article
  9.

    DARIC: A Data Reuse-Friendly CGRA for Parallel Data Access via Elastic FIFOs by Liu, Dajiang, Mou, Di, Zhu, Rong, Zhuang, Yan, Shang, Jiaxing, Zhong, Jiang, Yin, Shouyi

    Published: IEEE 09.07.2023
    “… Based on the resource graph of DARIC, a mapping algorithm supporting path sharing is proposed…”
    Conference Proceeding
  10.

    Separating Mechanism from Policy in STM by Sheng, Yaodong, Hassan, Ahmed, Spear, Michael

    Published: IEEE 21.10.2023
    “…When designing concurrent data structures (CDSs), it can feel like programmers must choose between performance and convenience. On one hand, Software…”
    Conference Proceeding
  11.

    SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link by Lee, Hyokeun, Choi, Kwanseok, Lee, Hyuk-Jae, Sim, Jaewoong

    Published: IEEE 21.10.2023
    “…Disaggregated memory has been gaining significant traction as a promising solution for scaling memory capacity and better utilizing memory resources in data…”
    Conference Proceeding
  12.

    Modularity‐based parallel protein design algorithm with an implementation using shared memory programming by Pal, Abantika, Mulumudy, Rohith, Mitra, Pralay

    ISSN: 0887-3585, 1097-0134
    Published: Hoboken, USA John Wiley & Sons, Inc 01.03.2022
    “…–based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU…”
    Journal Article
  13.

    Hardware Support for Durable Atomic Instructions for Persistent Parallel Programming by Hadi, Khan Shaikhul, Ul Mustafa, Naveed, Heinrich, Mark, Solihin, Yan

    Published: IEEE 09.07.2023
    “…Persistent memory is emerging as an attractive main memory fabric capable of hosting persistent data. However, its programmability is hampered by the lack of…”
    Conference Proceeding
  14.

    Mutable locks: Combining the best of spin and sleep locks by Marotta, Romolo, Tiriticco, Davide, Di Sanzo, Pierangelo, Pellegrini, Alessandro, Ciciani, Bruno, Quaglia, Francesco

    ISSN: 1532-0626, 1532-0634
    Published: Hoboken Wiley Subscription Services, Inc 25.11.2020
    Published in Concurrency and computation (25.11.2020)
    “…In this article, we present mutable locks, a synchronization construct with the same semantic of traditional locks (such as spin locks or sleep locks),…”
    Journal Article
  15.

    BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory by Hakert, Christian, Khan, Asif Ali, Chen, Kuan-Hsun, Hameed, Fazal, Castrillon, Jeronimo, Chen, Jian-Jia

    Published: IEEE 05.12.2021
    “…Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge…”
    Conference Proceeding
  16.

    Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management by Drebes, Andi, Pop, Antoniu, Heydemann, Karine, Cohen, Albert, Drach, Nathalie

    Published: ACM 01.09.2016
    “…Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality…”
    Conference Proceeding
  17.

    Ultrafast CPU/GPU Kernels for Density Accumulation in Placement by Guo, Zizheng, Mai, Jing, Lin, Yibo

    Published: IEEE 05.12.2021
    “…Density accumulation is a widely-used primitive operation in physical design, especially for placement. Iterative invocation in the optimization flow makes it…”
    Conference Proceeding
  18.

    An enhanced parallel block coordinate descent algorithm with shared memory for solving large-scale user equilibrium problems by Liu, Zhiyuan, Zhang, Yicheng, Zhang, Honggang, Zhang, Kai

    ISSN: 1366-5545
    Published: Elsevier Ltd 01.12.2025
    “…•OpenMP-based parallelization of the PBCD algorithm to leverage shared memory parallelism efficiently…”
    Journal Article
  19.

    CAF: Core to core Communication Acceleration Framework by Wang, Yipeng, Wang, Ren, Herdrich, Andrew, Tsai, James, Solihin, Yan

    Published: ACM 01.09.2016
    “… The traditional way cores communicate is by using shared memory space between them. However, shared memory communication fundamentally involves coherence invalidations…”
    Conference Proceeding
  20.

    Optimizing Persistent Memory Transactions by Zardoshti, Pantea, Zhou, Tingzhe, Liu, Yujie, Spear, Michael

    ISSN: 2641-7936
    Published: IEEE 01.09.2019
    “…Byte-addressable, non-volatile, random access memory (NVM) has the potential to dramatically accelerate the performance of storage-intensive workloads. For…”
    Conference Proceeding