Search results - "Computer systems organisation Architectures Other architecture Reconfigurable computing"

  1.

    DRISA: a DRAM-based Reconfigurable In-Situ Accelerator by Li, Shuangchen, Niu, Dimin, Malladi, Krishna T., Zheng, Hongzhong, Brennan, Bob, Xie, Yuan

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Published: New York, NY, USA, ACM, 14 October 2017
    “… Data movement between the processing units and the memory in traditional von Neumann architecture is creating the "memory wall" problem. To bridge the gap, two …”
    Full text
    Conference proceeding
  2.

    Maximizing CNN accelerator efficiency through resource partitioning by Yongming Shen, Ferdman, Michael, Milder, Peter

    Published: ACM, 1 June 2017
    “… Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based …”
    Full text
    Conference proceeding
  3.

    Stream-dataflow acceleration by Nowatzki, Tony, Gangadhar, Vinay, Ardalani, Newsha, Sankaralingam, Karthikeyan

    Published: ACM, 1 June 2017
    “… Demand for low-power data processing hardware continues to rise inexorably. Existing programmable and "general purpose" solutions (eg. SIMD, GPGPUs) are …”
    Full text
    Conference proceeding
  4.

    Understanding and optimizing asynchronous low-precision stochastic gradient descent by De Sa, Christopher, Feldman, Matthew, Re, Christopher, Olukotun, Kunle

    Published: ACM, 1 June 2017
    “… Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue …”
    Full text
    Conference proceeding
  5.

    Qubit Mapping for Reconfigurable Atom Arrays by Tan, Bochen, Bluvstein, Dolev, Lukin, Mikhail D., Cong, Jason

    ISSN: 1558-2434
    Published: ACM, 29 October 2022
    “… Because of the largest number of qubits available, and the massive parallel execution of entangling two-qubit gates, atom arrays is a promising platform for …”
    Full text
    Conference proceeding
  6.

    Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks by Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong

    ISSN: 1558-2434
    Published: ACM, 1 November 2016
    “… With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual …”
    Full text
    Conference proceeding
  7.

    SODA: Stencil with Optimized Dataflow Architecture by Chi, Yuze, Cong, Jason, Wei, Peng, Zhou, Peipei

    ISSN: 1558-2434
    Published: ACM, 1 November 2018
    “… Stencil computation is one of the most important kernels in many application domains such as image processing, solving partial differential equations, and …”
    Full text
    Conference proceeding
  8.

    TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference by Wei, Xuechao, Liang, Yun, Li, Xiuhong, Yu, Cody Hao, Zhang, Peng, Cong, Jason

    ISSN: 1558-2434
    Published: ACM, 1 November 2018
    “… FPGAs are more and more widely used as reconfigurable hardware accelerators for applications leveraging convolutional neural networks (CNNs) in recent years …”
    Full text
    Conference proceeding
  9.

    MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition by Qin, Yubin, Wang, Yang, Zhao, Zhiren, Yang, Xiaolong, Zhou, Yang, Wei, Shaojun, Hu, Yang, Yin, Shouyi

    Published: IEEE, 29 June 2024
    “… Large language models (LLMs) have been showing surprising performance in processing language tasks, bringing a new prevalence to deploy LLM from cloud to edge …”
    Full text
    Conference proceeding
  10.

    Hardware-Aware Machine Learning: Modeling and Optimization by Marculescu, Diana, Stamoulis, Dimitrios, Cai, Ermao

    ISSN: 1558-2434
    Published: ACM, 1 November 2018
    “… Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern …”
    Full text
    Conference proceeding
  11.

    Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs by Bouzidi, Halima, Odema, Mohanad, Ouarnoughi, Hamza, Niar, Smail, Al Faruque, Mohammad Abdullah

    Published: IEEE, 9 July 2023
    “… Heterogeneous MPSoCs comprise diverse processing units of varying compute capabilities. To date, the mapping strategies of neural networks (NNs) onto such …”
    Full text
    Conference proceeding
  12.

    ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions by Petrolo, Vincenzo, Guella, Flavia, Caon, Michele, Schiavone, Pasquale Davide, Masera, Guido, Martina, Maurizio

    Published: IEEE, 22 June 2025
    “… Modern data-driven applications expose limitations of von Neumann architectures-extensive data movement, low throughput, and poor energy efficiency …”
    Full text
    Conference proceeding
  13.

    A Memory-Efficient LLM Accelerator with Q-K Correlation Prediction using Cluster-Based Associative Array for Selective KV Accessing by Zhou, Zikang, Chen, Kaiqi, Duan, Xuyang, Han, Jun

    Published: IEEE, 22 June 2025
    “… Attention-based LLMs excel in text generation but face redundant computations in autoregressive token generation. While KV cache mitigates this, it introduces …”
    Full text
    Conference proceeding
  14.

    Supporting Register-based Addressing Modes for in-DRAM PIM ISAs by Kim, Seok Young, Choi, Byung Ho, Kang, Seokwon, Park, Yongjun, Kim, Seon Wook

    Published: IEEE, 22 June 2025
    “… Processing-in-Memory architecture presents a promising solution to alleviate the data movement bottleneck that arises from transferring data between memory and …”
    Full text
    Conference proceeding
  15.

    HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices by Jeon, Sangmin, Lee, Kangju, Lee, Kyeongwon, Lee, Woojoo

    Published: IEEE, 22 June 2025
    “… Processing-in-Memory (PIM) architectures offer promising solutions for efficiently handling AI applications in energy-constrained edge environments. While …”
    Full text
    Conference proceeding
  16.

    HeteroSVD: Efficient SVD Accelerator on Versal ACAP with Algorithm-Hardware Co-Design by Luan, Xinya, Lin, Zhe, Shi, Kai, Zhai, Jianwang, Zhao, Kang

    Published: IEEE, 22 June 2025
    “… Singular value decomposition (SVD) is a matrix factorization technique widely used in signal processing and recommendation systems, etc. In general, the time …”
    Full text
    Conference proceeding
  17.

    Dual-Issue Execution of Mixed Integer and Floating-Point Workloads on Energy-Efficient In-Order RISC-V Cores by Colagrande, Luca, Benini, Luca

    Published: IEEE, 22 June 2025
    “… To meet the computational requirements of modern workloads under tight energy constraints, general-purpose accelerator architectures have to integrate an …”
    Full text
    Conference proceeding
  18.

    RADiT: Redundancy-Aware Diffusion Transformer Acceleration Leveraging Timestep Similarity by Park, Youngjun, Kim, Sangyeon, Kim, Yeonggeon, Ji, Gisan, Ryu, Sungju

    Published: IEEE, 22 June 2025
    “… Diffusion Transformers (DiTs) have demonstrated unprecedented performance across various generative tasks including image and video generation. However, a …”
    Full text
    Conference proceeding
  19.

    Late Breaking Results: FPGen-3D: Automated Framework for 3D-FPGA Architecture Generation and Exploration by Youssef, Ismael, Hao, Cong Callie

    Published: IEEE, 22 June 2025
    “… In this work, we propose FPGen-3D, an automated framework for 3D field-programmable gate arrays (FPGA) architecture generation and exploration. FPGen-3D …”
    Full text
    Conference proceeding
  20.

    Buffer Prospector: Discovering and Exploiting Untapped Buffer Resources in Many-Core DNN Accelerators by Wei, Yuchen, Cai, Jingwei, Gao, Mingyu, Peng, Sen, Wu, Zuotong, Shi, Guiming, Ma, Kaisheng

    Published: IEEE, 22 June 2025
    “… In large-scale DNN inference accelerators, the many-core architecture has emerged as a predominant design, with layer-pipeline (LP) mapping being a mainstream …”
    Full text
    Conference proceeding