Search Results - "Computer systems organization Architectures Other architectures Reconfigurable computing"

  1.

    DRISA: a DRAM-based Reconfigurable In-Situ Accelerator by Li, Shuangchen, Niu, Dimin, Malladi, Krishna T., Zheng, Hongzhong, Brennan, Bob, Xie, Yuan

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Published: New York, NY, USA ACM 14.10.2017
    “…Data movement between the processing units and the memory in traditional von Neumann architecture is creating the "memory wall" problem. To bridge the gap, two…”
    Conference Proceeding
  2.

    Maximizing CNN accelerator efficiency through resource partitioning by Yongming Shen, Ferdman, Michael, Milder, Peter

    Published: ACM 01.06.2017
    “…Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based…”
    Conference Proceeding
  3.

    Stream-dataflow acceleration by Nowatzki, Tony, Gangadhar, Vinay, Ardalani, Newsha, Sankaralingam, Karthikeyan

    Published: ACM 01.06.2017
    “…Demand for low-power data processing hardware continues to rise inexorably. Existing programmable and "general purpose" solutions (eg. SIMD, GPGPUs) are…”
    Conference Proceeding
  4.

    Understanding and optimizing asynchronous low-precision stochastic gradient descent by De Sa, Christopher, Feldman, Matthew, Ré, Christopher, Olukotun, Kunle

    Published: ACM 01.06.2017
    “…Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue…”
    Conference Proceeding
  5.

    Qubit Mapping for Reconfigurable Atom Arrays by Tan, Bochen, Bluvstein, Dolev, Lukin, Mikhail D., Cong, Jason

    ISSN: 1558-2434
    Published: ACM 29.10.2022
    “…Because of the largest number of qubits available, and the massive parallel execution of entangling two-qubit gates, atom arrays is a promising platform for…”
    Conference Proceeding
  6.

    Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks by Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong

    ISSN: 1558-2434
    Published: ACM 01.11.2016
    “…With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual…”
    Conference Proceeding
  7.

    SODA: Stencil with Optimized Dataflow Architecture by Chi, Yuze, Cong, Jason, Wei, Peng, Zhou, Peipei

    ISSN: 1558-2434
    Published: ACM 01.11.2018
    “…Stencil computation is one of the most important kernels in many application domains such as image processing, solving partial differential equations, and…”
    Conference Proceeding
  8.

    TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference by Wei, Xuechao, Liang, Yun, Li, Xiuhong, Yu, Cody Hao, Zhang, Peng, Cong, Jason

    ISSN: 1558-2434
    Published: ACM 01.11.2018
    “…FPGAs are more and more widely used as reconfigurable hardware accelerators for applications leveraging convolutional neural networks (CNNs) in recent years…”
    Conference Proceeding
  9.

    MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition by Qin, Yubin, Wang, Yang, Zhao, Zhiren, Yang, Xiaolong, Zhou, Yang, Wei, Shaojun, Hu, Yang, Yin, Shouyi

    Published: IEEE 29.06.2024
    “…Large language models (LLMs) have been showing surprising performance in processing language tasks, bringing a new prevalence to deploy LLM from cloud to edge…”
    Conference Proceeding
  10.

    Hardware-Aware Machine Learning: Modeling and Optimization by Marculescu, Diana, Stamoulis, Dimitrios, Cai, Ermao

    ISSN: 1558-2434
    Published: ACM 01.11.2018
    “…Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern…”
    Conference Proceeding
  11.

    Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs by Bouzidi, Halima, Odema, Mohanad, Ouarnoughi, Hamza, Niar, Smail, Al Faruque, Mohammad Abdullah

    Published: IEEE 09.07.2023
    “…Heterogeneous MPSoCs comprise diverse processing units of varying compute capabilities. To date, the mapping strategies of neural networks (NNs) onto such…”
    Conference Proceeding
  12.

    A Memory-Efficient LLM Accelerator with Q-K Correlation Prediction using Cluster-Based Associative Array for Selective KV Accessing by Zhou, Zikang, Chen, Kaiqi, Duan, Xuyang, Han, Jun

    Published: IEEE 22.06.2025
    “…Attention-based LLMs excel in text generation but face redundant computations in autoregressive token generation. While KV cache mitigates this, it introduces…”
    Conference Proceeding
  13.

    Supporting Register-based Addressing Modes for in-DRAM PIM ISAs by Kim, Seok Young, Choi, Byung Ho, Kang, Seokwon, Park, Yongjun, Kim, Seon Wook

    Published: IEEE 22.06.2025
    “…Processing-in-Memory architecture presents a promising solution to alleviate the data movement bottleneck that arises from transferring data between memory and…”
    Conference Proceeding
  14.

    ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions by Petrolo, Vincenzo, Guella, Flavia, Caon, Michele, Schiavone, Pasquale Davide, Masera, Guido, Martina, Maurizio

    Published: IEEE 22.06.2025
    “…Modern data-driven applications expose limitations of von Neumann architectures: extensive data movement, low throughput, and poor energy efficiency…”
    Conference Proceeding
  15.

    HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices by Jeon, Sangmin, Lee, Kangju, Lee, Kyeongwon, Lee, Woojoo

    Published: IEEE 22.06.2025
    “…Processing-in-Memory (PIM) architectures offer promising solutions for efficiently handling AI applications in energy-constrained edge environments. While…”
    Conference Proceeding
  16.

    RADiT: Redundancy-Aware Diffusion Transformer Acceleration Leveraging Timestep Similarity by Park, Youngjun, Kim, Sangyeon, Kim, Yeonggeon, Ji, Gisan, Ryu, Sungju

    Published: IEEE 22.06.2025
    “…Diffusion Transformers (DiTs) have demonstrated unprecedented performance across various generative tasks including image and video generation. However, a…”
    Conference Proceeding
  17.

    Late Breaking Results: FPGen-3D: Automated Framework for 3D-FPGA Architecture Generation and Exploration by Youssef, Ismael, Hao, Cong Callie

    Published: IEEE 22.06.2025
    “…In this work, we propose FPGen-3D, an automated framework for 3D field-programmable gate arrays (FPGA) architecture generation and exploration. FPGen-3D…”
    Conference Proceeding
  18.

    Configurable DSP-Based CAM Architecture for Data-Intensive Applications on FPGAs by Chen, Yao, Yu, Feng, Wu, Di, Wong, Weng-Fai, He, Bingsheng

    Published: IEEE 22.06.2025
    “…Content-addressable memory (CAM) is a type of fast memory unique in its ability to perform parallel searches of stored data based on content rather than…”
    Conference Proceeding
  19.

    VersaSlot: Efficient Fine-grained FPGA Sharing with Big.Little Slots and Live Migration in FPGA Cluster by Gu, Jianfeng, Wang, Hao, Guo, Xiaorang, Schulz, Martin, Gerndt, Michael

    Published: IEEE 22.06.2025
    “…As FPGAs gain popularity for on-demand application acceleration in data center computing, dynamic partial reconfiguration (DPR) has become an effective…”
    Conference Proceeding
  20.

    MambaOPU: An FPGA Overlay Processor for State-space-duality-based Mamba Models by Lu, Shaoqiang, Yu, Xuliang, Zhao, Tiandong, Miao, Siyuan, Sheng, Xinsong, Wu, Chen, Zhao, Liang, Lin, Ting-Jung, He, Lei

    Published: IEEE 22.06.2025
    “…State-space models (SSMs), such as Mamba, have emerged as a promising alternative to Transformers. However, the recently developed Mamba2, based on state space…”
    Conference Proceeding