Search results - Theory of computation → Shared memory algorithms

  1. Verifying Lock-Free Search Structure Templates (Artifact) by Patel, Nisarg; Shasha, Dennis; Wies, Thomas

    ISSN: 2509-8195
    Published: Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 12 September 2024
    “… We present and verify template algorithms for lock-free concurrent search structures that cover a broad range of existing implementations based on lists and skiplists …”
    Full text
    Dataset
  2. pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures by Baek, Daehyeon; Hwang, Soojin; Huh, Jaehyuk

    Published: IEEE, 29 June 2024
    “… Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level memory bandwidth …”
    Full text
    Conference proceedings
  3. Max-PIM: Fast and Efficient Max/Min Searching in DRAM by Zhang, Fan; Angizi, Shaahin; Fan, Deliang

    Published: IEEE, 5 December 2021
    “… In this work, for the first time, we propose a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for minimum and maximum …”
    Full text
    Conference proceedings
  4. SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM by Yoon, Byungkuk; Han, Sanghyeok; Park, Gyeonghwan; Kim, Jae-Joon

    Published: IEEE, 22 June 2025
    “… Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all banks operate simultaneously under a single command …”
    Full text
    Conference proceedings
  5. PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System by Yeh, Chun-Le; Chen, Liang-Chi; Ho, Chien-Chung; Chang, Yu-Ming; Chang, Da-Wei

    Published: IEEE, 22 June 2025
    “… To overcome these challenges, we explore UPMEM's DPU, a processing-in-memory (PIM) technology that reduces data movement by performing computations directly within memory …”
    Full text
    Conference proceedings
  6. UPVSS: Jointly Managing Vector Similarity Search with Near-Memory Processing Systems by Liu, Chun-Chien; Wu, Chun-Feng; Jin, Yunho

    Published: IEEE, 22 June 2025
    “… to performance limitations in conventional von Neumann architecture with shared memory buses. This data movement bottleneck restricts the efficiency and scalability of vector similarity search due to insufficient memory bandwidth …”
    Full text
    Conference proceedings
  7. AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory by Chen, Liyan; Lyu, Dongxu; Li, Zhenyu; Jiang, Jianfei; Wang, Qin; Mao, Zhigang; Jing, Naifeng

    Published: IEEE, 22 June 2025
    “… While recent heterogeneous architectures attempt to address the memory-bound bottleneck from attention computations by processing-in-memory (PIM …”
    Full text
    Conference proceedings
  8. BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory by Hakert, Christian; Khan, Asif Ali; Chen, Kuan-Hsun; Hameed, Fazal; Castrillon, Jeronimo; Chen, Jian-Jia

    Published: IEEE, 5 December 2021
    “… Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge …”
    Full text
    Conference proceedings
  9. SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link by Lee, Hyokeun; Choi, Kwanseok; Lee, Hyuk-Jae; Sim, Jaewoong

    Published: IEEE, 21 October 2023
    “… Disaggregated memory has been gaining significant traction as a promising solution for scaling memory capacity and better utilizing memory resources in data centers …”
    Full text
    Conference proceedings
  10. Hardware Support for Durable Atomic Instructions for Persistent Parallel Programming by Hadi, Khan Shaikhul; Ul Mustafa, Naveed; Heinrich, Mark; Solihin, Yan

    Published: IEEE, 9 July 2023
    “… Persistent memory is emerging as an attractive main memory fabric capable of hosting persistent data …”
    Full text
    Conference proceedings
  11. Optimizing Persistent Memory Transactions by Zardoshti, Pantea; Zhou, Tingzhe; Liu, Yujie; Spear, Michael

    ISSN: 2641-7936
    Published: IEEE, 1 September 2019
    “… Byte-addressable, non-volatile, random access memory (NVM) has the potential to dramatically accelerate the performance of storage-intensive workloads …”
    Full text
    Conference proceedings
  12. Separating Mechanism from Policy in STM by Sheng, Yaodong; Hassan, Ahmed; Spear, Michael

    Published: IEEE, 21 October 2023
    “… On one hand, Software Transactional Memory (STM) is easy, because it allows programmers to simply mark regions of sequential code as requiring atomicity, and then the compiler ensures that no races manifest …”
    Full text
    Conference proceedings
  13. DARIC: A Data Reuse-Friendly CGRA for Parallel Data Access via Elastic FIFOs by Liu, Dajiang; Mou, Di; Zhu, Rong; Zhuang, Yan; Shang, Jiaxing; Zhong, Jiang; Yin, Shouyi

    Published: IEEE, 9 July 2023
    “… For parallel data accesses, uniform memory partitioning is usually introduced to CGRA for better pipelining performance …”
    Full text
    Conference proceedings
  14. Asynchronous Distributed-Memory Parallel Algorithms for Influence Maximization by Singhal, Shubhendra Pal; Hati, Souvadra; Young, Jeffrey; Sarkar, Vivek; Hayashi, Akihiro; Vuduc, Richard

    Published: IEEE, 17 November 2024
    “… We propose distributed-memory parallel algorithms for the two main kernels of a state-of-the-art implementation of one IM algorithm, influence maximization via martingales (IMM …”
    Full text
    Conference proceedings
  15. Ultrafast CPU/GPU Kernels for Density Accumulation in Placement by Guo, Zizheng; Mai, Jing; Lin, Yibo

    Published: IEEE, 5 December 2021
    “… Density accumulation is a widely-used primitive operation in physical design, especially for placement. Iterative invocation in the optimization flow makes it …”
    Full text
    Conference proceedings
  16. Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management by Drebes, Andi; Pop, Antoniu; Heydemann, Karine; Cohen, Albert; Drach, Nathalie

    Published: ACM, 1 September 2016
    “… Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality …”
    Full text
    Conference proceedings
  17. Unfair Scheduling Patterns in NUMA Architectures by Ben-David, Naama; Scully, Ziv; Blelloch, Guy E.

    ISSN: 2641-7936
    Published: IEEE, 1 September 2019
    “… Lock-free algorithms are typically designed and analyzed with adversarial scheduling in mind …”
    Full text
    Conference proceedings
  18. CAF: Core to core Communication Acceleration Framework by Wang, Yipeng; Wang, Ren; Herdrich, Andrew; Tsai, James; Solihin, Yan

    Published: ACM, 1 September 2016
    “… The traditional way cores communicate is by using shared memory space between them. However, shared memory communication fundamentally involves coherence invalidations …”
    Full text
    Conference proceedings
  19. Enumeration of Billions of Maximal Bicliques in Bipartite Graphs without Using GPUs by Pan, Zhe; He, Shuibing; Li, Xu; Zhang, Xuechen; Yin, Yanlong; Wang, Rui; Shou, Lidan; Song, Mingli; Sun, Xian-He; Chen, Gang

    Published: IEEE, 17 November 2024
    “… To overcome this limitation, we propose an AdaMBE algorithm. First, we redesign its core operations using local neighborhood information derived from computational subgraphs to minimize redundant memory accesses …”
    Full text
    Conference proceedings
  20. Parallel Top-K Algorithms on GPU: A Comprehensive Study and New Methods by Zhang, Jingrong; Naruse, Akira; Li, Xipeng; Wang, Yong

    ISSN: 2167-4337
    Published: ACM, 11 November 2023
    “… As data volume grows rapidly, high-performance parallel top-K algorithms become critical …”
    Full text
    Conference proceedings