Search Results - shared memory parallel algorithm

  1.

    Max-PIM: Fast and Efficient Max/Min Searching in DRAM by Zhang, Fan, Angizi, Shaahin, Fan, Deliang

    Published: IEEE 05.12.2021
    “… In this work, for the first time, we propose a novel 'Min/Max-in-memory' algorithm based on iterative XNOR bit-wise comparison, which supports parallel in-memory searching for minimum and maximum…”
    Conference Proceeding
  2.

    Modularity‐based parallel protein design algorithm with an implementation using shared memory programming by Pal, Abantika, Mulumudy, Rohith, Mitra, Pralay

    ISSN: 0887-3585, 1097-0134
    Published: Hoboken, USA John Wiley & Sons, Inc 01.03.2022
    “…–based parallel protein design algorithm. The modular architecture of the protein structure is exploited by considering an intermediate structural organization between secondary structure and domain defined as protein unit (PU…”
    Journal Article
  3.

    Joint direct and transposed sparse matrix‐vector multiplication for multithreaded CPUs by Kozický, Claudio, Šimeček, Ivan

    ISSN: 1532-0626, 1532-0634
    Published: Hoboken Wiley Subscription Services, Inc 10.07.2021
    Published in Concurrency and computation (10.07.2021)
    “…‐vector multiplication (SpMMTV). In this article, we present a parallel SpMMTV algorithm for shared-memory CPUs…”
    Journal Article
  4.

    A Shared-Memory Parallel Alpha-Tree Algorithm for Extreme Dynamic Ranges by Ryu, Jiwoo, Trager, Scott C., Wilkinson, Michael H. F.

    ISSN: 1057-7149, 1941-0042
    Published: United States IEEE 01.01.2025
    Published in IEEE transactions on image processing (01.01.2025)
    “…The α-tree is an effective hierarchical image representation used for connected…”
    Journal Article
  5.

    pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures by Baek, Daehyeon, Hwang, Soojin, Huh, Jaehyuk

    Published: IEEE 29.06.2024
    “…Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level memory bandwidth…”
    Conference Proceeding
  6.

    SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM by Yoon, Byungkuk, Han, Sanghyeok, Park, Gyeonghwan, Kim, Jae-Joon

    Published: IEEE 22.06.2025
    “…Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all banks operate simultaneously under a single command…”
    Conference Proceeding
  7.

    PIMDup: An Optimized Deduplication Design on a Real Processing-in-Memory System by Yeh, Chun-Le, Chen, Liang-Chi, Ho, Chien-Chung, Chang, Yu-Ming, Chang, Da-Wei

    Published: IEEE 22.06.2025
    “… and storage are separated in a processor-centric design, necessitating multiple memory hierarchy traversals and causing inefficiencies…”
    Conference Proceeding
  8.

    A Hybrid Shared-Memory Parallel Max-Tree Algorithm for Extreme Dynamic-Range Images by Moschini, Ugo, Meijster, Arnold, Wilkinson, Michael H. F.

    ISSN: 0162-8828, 1939-3539, 2160-9292
    Published: United States IEEE 01.03.2018
    “… However, we show that the current parallel algorithms perform poorly already with integers at bit depths higher than 16 bits per pixel…”
    Journal Article
  9.

    UPVSS: Jointly Managing Vector Similarity Search with Near-Memory Processing Systems by Liu, Chun-Chien, Wu, Chun-Feng, Jin, Yunho

    Published: IEEE 22.06.2025
    “… to performance limitations in conventional von Neumann architecture with shared memory buses. This data movement bottleneck restricts the efficiency and scalability of vector similarity search due to insufficient memory bandwidth…”
    Conference Proceeding
  10.

    AttenPIM: Accelerating LLM Attention with Dual-mode GEMV in Processing-in-Memory by Chen, Liyan, Lyu, Dongxu, Li, Zhenyu, Jiang, Jianfei, Wang, Qin, Mao, Zhigang, Jing, Naifeng

    Published: IEEE 22.06.2025
    “… While recent heterogeneous architectures attempt to address the memory-bound bottleneck from attention computations by processing-in-memory (PIM…”
    Conference Proceeding
  11.

    Hardware Support for Durable Atomic Instructions for Persistent Parallel Programming by Hadi, Khan Shaikhul, Ul Mustafa, Naveed, Heinrich, Mark, Solihin, Yan

    Published: IEEE 09.07.2023
    “…Persistent memory is emerging as an attractive main memory fabric capable of hosting persistent data…”
    Conference Proceeding
  12.

    SDM: Sharing-Enabled Disaggregated Memory System with Cache Coherent Compute Express Link by Lee, Hyokeun, Choi, Kwanseok, Lee, Hyuk-Jae, Sim, Jaewoong

    Published: IEEE 21.10.2023
    “…Disaggregated memory has been gaining significant traction as a promising solution for scaling memory capacity and better utilizing memory resources in data centers…”
    Conference Proceeding
  13.

    DARIC: A Data Reuse-Friendly CGRA for Parallel Data Access via Elastic FIFOs by Liu, Dajiang, Mou, Di, Zhu, Rong, Zhuang, Yan, Shang, Jiaxing, Zhong, Jiang, Yin, Shouyi

    Published: IEEE 09.07.2023
    “… For parallel data accesses, uniform memory partitioning is usually introduced to CGRA for better pipelining performance…”
    Conference Proceeding
  14.

    Scalable task parallelism for NUMA: A uniform abstraction for coordinated scheduling and memory management by Drebes, Andi, Pop, Antoniu, Heydemann, Karine, Cohen, Albert, Drach, Nathalie

    Published: ACM 01.09.2016
    “…Dynamic task-parallel programming models are popular on shared-memory systems, promising enhanced scalability, load balancing and locality…”
    Conference Proceeding
  15.

    A Shared-Memory Parallel Algorithm for Updating Single-Source Shortest Paths in Large Dynamic Networks by Srinivasan, Sriram, Riazi, Sara, Norris, Boyana, Das, Sajal K., Bhowmick, Sanjukta

    ISSN: 2640-0316
    Published: IEEE 01.12.2018
    “… To this end, we present a novel two-step shared-memory algorithm for updating SSSP on weighted large-scale graphs…”
    Conference Proceeding
  16.

    A hybrid shared/distributed memory parallel genetic algorithm for optimization of laminate composites by Rocha, I.B.C.M., Parente, E., Melo, A.M.C.

    ISSN: 0263-8223, 1879-1085
    Published: Elsevier Ltd 01.01.2014
    Published in Composite structures (01.01.2014)
    “…This work presents a genetic algorithm combining two types of computational parallelization methods, resulting in a hybrid shared/distributed memory algorithm based on the island model using…”
    Journal Article
  17.

    BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory by Hakert, Christian, Khan, Asif Ali, Chen, Kuan-Hsun, Hameed, Fazal, Castrillon, Jeronimo, Chen, Jian-Jia

    Published: IEEE 05.12.2021
    “…Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge…”
    Conference Proceeding
  18.

    A Parallel Algorithm Template for Updating Single-Source Shortest Paths in Large-Scale Dynamic Networks by Khanda, Arindam, Srinivasan, Sriram, Bhowmick, Sanjukta, Norris, Boyana, Das, Sajal K.

    ISSN: 1045-9219, 1558-2183
    Published: New York IEEE 01.04.2022
    “… We present a novel parallel algorithmic framework for updating the SSSP in large-scale dynamic networks and implement it on the shared-memory and GPU platforms…”
    Journal Article
  19.

    Separating Mechanism from Policy in STM by Sheng, Yaodong, Hassan, Ahmed, Spear, Michael

    Published: IEEE 21.10.2023
    “… On one hand, Software Transactional Memory (STM) is easy, because it allows programmers to simply mark regions of sequential code as requiring atomicity, and then the compiler ensures that no races manifest…”
    Conference Proceeding
  20.

    Shared-Memory Parallel Edmonds Blossom Algorithm for Maximum Cardinality Matching in General Graphs by Schwing, Gregory, Grosu, Daniel, Schwiebert, Loren

    ISSN: 2164-7062
    Published: United States IEEE 01.05.2024
    “…The Edmonds Blossom algorithm is implemented here using depth-first search, which is intrinsically serial…”
    Conference Proceeding, Journal Article