Search Results - IEEE International Conference on Algorithms AND Architectures for Parallel Processing

  1.

    Proceedings fifth International Conference on algorithms and architectures for parallel processing by IEEE International Conference on Algorithms and Architectures for Parallel Processing, Zhou, Wanlei

    ISBN: 0769515126, 9780769515144, 0769515134, 9780769515137, 0769515142, 9780769515120
    Published: Los Alamitos ; Tokyo IEEE Computer Society 2002
    Book
  2.
  3.
  4.
  5.

    Algorithms and architectures for parallel processing: 1997 3rd international conference, Melbourne, Australia, December 10-12 1997 by Goscinski, Andrzej, Zhou, Wanlei, Hobbs, Michael

    ISBN: 0780342291, 9780780342293
    Published: World Scientific Publishing Co. Pte. Ltd 1997
    “…This volume of proceedings describes the lower costs and higher degrees of integration of chip architecture which allow parallel processing…”
    eBook
  6.

    Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses by Ai, Yang, Ling, Zhen-Hua

    ISSN: 2379-190X
    Published: IEEE 04.06.2023
    “… The proposed model is a cascade of a residual convolutional network and a parallel estimation architecture…”
    Conference Proceeding
  7.

    InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-Aware Inner Product Processing by Baek, Daehyeon, Hwang, Soojin, Heo, Taekyung, Kim, Daehoon, Huh, Jaehyuk

    Published: IEEE 01.09.2021
    “… To mitigate the memory access overheads, recent accelerator designs advocated the outer product processing which minimizes input accesses but generates intermediate products to be merged to the final output matrix…”
    Conference Proceeding
  8.

    Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs by Wang, Pengyu, Li, Chao, Wang, Jing, Wang, Taolei, Zhang, Lu, Leng, Jingwen, Chen, Quan, Guo, Minyi

    Published: IEEE 01.09.2021
    “…Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs…”
    Conference Proceeding
  9.

    Splitwise: Efficient Generative LLM Inference Using Phase Splitting by Patel, Pratyush, Choukse, Esha, Zhang, Chaojie, Shah, Aashaka, Goiri, Inigo, Maleki, Saeed, Bianchini, Ricardo

    Published: IEEE 29.06.2024
    “…Generative large language model (LLM) applications are growing rapidly, leading to large-scale deployments of expensive and power-hungry GPUs. Our…”
    Conference Proceeding
  10.

    Accelerating Graph Convolutional Networks Using Crossbar-based Processing-In-Memory Architectures by Huang, Yu, Zheng, Long, Yao, Pengcheng, Wang, Qinggang, Liao, Xiaofei, Jin, Hai, Xue, Jingling

    ISSN: 2378-203X
    Published: IEEE 01.04.2022
    “… efficiency. In this paper, we present a new GCN accelerator, RE-FLIP, with three key innovations in terms of architecture design, algorithm mappings, and practical implementations…”
    Conference Proceeding
  11.

    A Hybrid Systolic-Dataflow Architecture for Inductive Matrix Algorithms by Weng, Jian, Liu, Sihao, Wang, Zhengrong, Dadu, Vidushi, Nowatzki, Tony

    ISSN: 2378-203X
    Published: IEEE 01.02.2020
    “… in the hardware/software interface, then a spatial architecture could efficiently execute parallel code regions…”
    Conference Proceeding
  12.

    pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures by Baek, Daehyeon, Hwang, Soojin, Huh, Jaehyuk

    Published: IEEE 29.06.2024
    “… Sparse matrix processing is another critical computation that can significantly benefit from the PIM architecture, but the current all-bank PIM control cannot support diverging executions due to the random sparsity…”
    Conference Proceeding
  13.

    MAD-Max Beyond Single-Node: Enabling Large Machine Learning Model Acceleration on Distributed Systems by Hsia, Samuel, Golden, Alicia, Acun, Bilge, Ardalani, Newsha, DeVito, Zachary, Wei, Gu-Yeon, Brooks, David, Wu, Carole-Jean

    Published: IEEE 29.06.2024
    “…Training and deploying large-scale machine learning models is time-consuming, requires significant distributed computing infrastructures, and incurs high…”
    Conference Proceeding
  14.

    Parallelizing Maximal Clique Enumeration on GPUs by Almasri, Mohammad, Chang, Yen-Hsiang, Hajj, Izzat El, Nagi, Rakesh, Xiong, Jinjun, Hwu, Wen-mei

    Published: IEEE 21.10.2023
    “…We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm…”
    Conference Proceeding
  15.

    PolyGraph: Exposing the Value of Flexibility for Graph Processing Accelerators by Dadu, Vidushi, Liu, Sihao, Nowatzki, Tony

    ISSN: 2575-713X
    Published: IEEE 01.06.2021
    “… First, we identify a taxonomy of key algorithm variants. Then we develop a template architecture (PolyGraph…”
    Conference Proceeding
  16.

    MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks by Jang, Hanhwi, Kim, Joonsung, Jo, Jae-Eon, Lee, Jaewon, Kim, Jangwoo

    ISSN: 2575-713X
    Published: ACM 01.06.2019
    “… Such large-scale memory networks provide excellent reasoning power; however, the current computer infrastructure cannot achieve scalable performance due to its limited system architecture…”
    Conference Proceeding
  17.

    An In-Network Architecture for Accelerating Shared-Memory Multiprocessor Collectives by Klenk, Benjamin, Jiang, Nan, Thorson, Greg, Dennison, Larry

    Published: IEEE 01.05.2020
    “…The slowdown of single-chip performance scaling combined with the growing demands of computing ever larger problems efficiently has led to a renewed interest in distributed architectures and specialized hardware…”
    Conference Proceeding
  18.

    DRRA-based Reconfigurable Architecture for Mixed-Radix FFT by Kallapu, Reeshita, Stathis, Dimitrios, Boppu, Srinivas, Hemani, Ahmed

    ISSN: 2380-6923
    Published: IEEE 01.01.2023
    Published in VLSI design (01.01.2023)
    “… In this paper, we propose an architecture for the implementation of the FFT that is derived from the Dynamically Reconfigurable Resource Array and has multiple parallel processing cells while also…”
    Conference Proceeding
  19.

    Seer: Predictive Runtime Kernel Selection for Irregular Problems by Swann, Ryan, Osama, Muhammad, Sangaiah, Karthik, Mahmud, Jalal

    ISSN: 2643-2838
    Published: IEEE 02.03.2024
    “…Modern GPUs are designed for regular problems and suffer from load imbalance when processing irregular data…”
    Conference Proceeding
  20.

    Accelerating Fourier and Number Theoretic Transforms using Tensor Cores and Warp Shuffles by Durrani, Sultan, Chughtai, Muhammad Saad, Hidayetoglu, Mert, Tahir, Rashid, Dakkak, Abdul, Rauchwerger, Lawrence, Zaffar, Fareed, Hwu, Wen-mei

    Published: IEEE 01.09.2021
    “… To speed things up, fast Fourier transform (FFT) algorithms, which are reduced-complexity formulations for computing the DFT of a sequence, have been proposed and implemented for traditional processors and their corresponding instruction sets…”
    Conference Proceeding