Search results - Computer systems organization > Architectures > Other architectures > Reconfigurable computing

  1.

    DRISA: a DRAM-based Reconfigurable In-Situ Accelerator
    Authors: Li, Shuangchen; Niu, Dimin; Malladi, Krishna T.; Zheng, Hongzhong; Brennan, Bob; Xie, Yuan

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Publication details: New York, NY, USA ACM 14.10.2017
    “… To address the challenge, we propose DRISA, a DRAM-based Reconfigurable In-Situ Accelerator architecture, to provide both powerful computing capability and large memory capacity/bandwidth…”
    Conference paper
  2.

    Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks
    Authors: Chen Zhang; Zhenman Fang; Peipei Zhou; Peichen Pan; Jason Cong

    ISSN: 1558-2434
    Publication details: ACM 01.11.2016
    “… Second, we design Caffeine with the goal to maximize the underlying FPGA computing and bandwidth…”
    Conference paper
  3.

    Stream-dataflow acceleration
    Authors: Nowatzki, Tony; Gangadhar, Vinay; Ardalani, Newsha; Sankaralingam, Karthikeyan

    Publication details: ACM 01.06.2017
    “…) are insufficient, as evidenced by the order-of-magnitude improvements and industry adoption of application and domain-specific accelerators in important areas like machine learning, computer vision and big data…”
    Conference paper
  4.

    SODA: Stencil with Optimized Dataflow Architecture
    Authors: Chi, Yuze; Cong, Jason; Wei, Peng; Zhou, Peipei

    ISSN: 1558-2434
    Publication details: ACM 01.11.2018
    “… In this paper we present SODA, an automated framework for implementing Stencil algorithms with Optimized Dataflow Architecture on FPGAs…”
    Conference paper
  5.

    Maximizing CNN accelerator efficiency through resource partitioning
    Authors: Yongming Shen; Ferdman, Michael; Milder, Peter

    Publication details: ACM 01.06.2017
    “…Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based…”
    Conference paper
  6.

    Qubit Mapping for Reconfigurable Atom Arrays
    Authors: Tan, Bochen; Bluvstein, Dolev; Lukin, Mikhail D.; Cong, Jason

    ISSN: 1558-2434
    Publication details: ACM 29.10.2022
    “…Because of the largest number of qubits available, and the massive parallel execution of entangling two-qubit gates, atom arrays is a promising platform for quantum computing…”
    Conference paper
  7.

    FEATHER: A Reconfigurable Accelerator with Data Reordering Support for Low-Cost On-Chip Dataflow Switching
    Authors: Tong, Jianming; Itagi, Anirudh; Chatarasi, Prasanth; Krishna, Tushar

    Publication details: IEEE 29.06.2024
    “…The inference of ML models composed of diverse structures, types, and sizes boils down to the execution of different dataflows (i.e. different tiling,…”
    Conference paper
  8.

    Understanding and optimizing asynchronous low-precision stochastic gradient descent
    Authors: De Sa, Christopher; Feldman, Matthew; Re, Christopher; Olukotun, Kunle

    Publication details: ACM 01.06.2017
    “…Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue…”
    Conference paper
  9.

    MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition
    Authors: Qin, Yubin; Wang, Yang; Zhao, Zhiren; Yang, Xiaolong; Zhou, Yang; Wei, Shaojun; Hu, Yang; Yin, Shouyi

    Publication details: IEEE 29.06.2024
    “…Large language models (LLMs) have been showing surprising performance in processing language tasks, bringing a new prevalence to deploy LLM from cloud to edge…”
    Conference paper
  10.

    TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference
    Authors: Wei, Xuechao; Liang, Yun; Li, Xiuhong; Yu, Cody Hao; Zhang, Peng; Cong, Jason

    ISSN: 1558-2434
    Publication details: ACM 01.11.2018
    “…FPGAs are more and more widely used as reconfigurable hardware accelerators for applications leveraging convolutional neural networks (CNNs) in recent years…”
    Conference paper
  11.

    Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs
    Authors: Bouzidi, Halima; Odema, Mohanad; Ouarnoughi, Hamza; Niar, Smail; Al Faruque, Mohammad Abdullah

    Publication details: IEEE 09.07.2023
    “… To date, the mapping strategies of neural networks (NNs) onto such systems are yet to exploit the full potential of processing parallelism, made possible through both the intrinsic NNs' structure and underlying hardware composition…”
    Conference paper
  12.

    HAL: Hardware-assisted Load Balancing for Energy-efficient SNIC-Host Cooperative Computing
    Authors: Huang, Jinghan; Lou, Jiaqi; Vanavasam, Srikar; Kong, Xinhao; Ji, Houxiang; Jeong, Ipoom; Zhuo, Danyang; Lee, Eun Kyung; Kim, Nam Sung

    Publication details: IEEE 29.06.2024
    “… With such a processor, the SNIC has promised to notably improve the system-wide energy efficiency of datacenter servers…”
    Conference paper
  13.

    MambaOPU: An FPGA Overlay Processor for State-space-duality-based Mamba Models
    Authors: Lu, Shaoqiang; Yu, Xuliang; Zhao, Tiandong; Miao, Siyuan; Sheng, Xinsong; Wu, Chen; Zhao, Liang; Lin, Ting-Jung; He, Lei

    Publication details: IEEE 22.06.2025
    “…State-space models (SSMs), such as Mamba, have emerged as a promising alternative to Transformers. However, the recently developed Mamba2, based on state space…”
    Conference paper
  14.

    CoSPARSE: A Software and Hardware Reconfigurable SpMV Framework for Graph Analytics
    Authors: Feng, Siying; Sun, Jiawen; Pal, Subhankar; He, Xin; Kaszyk, Kuba; Park, Dong-hyeon; Morton, Magnus; Mudge, Trevor; Cole, Murray; O'Boyle, Michael; Chakrabarti, Chaitali; Dreslinski, Ronald

    Publication details: IEEE 05.12.2021
    “… reconfiguration as a synergistic solution to accelerate SpMV-based graph analytics algorithms. Building on previously proposed general-purpose reconfigurable hardware…”
    Conference paper
  15.

    Heterogeneous Reconfigurable Accelerators: Trends and Perspectives
    Author: Luk, Wayne

    Publication details: IEEE 09.07.2023
    “…Heterogeneity and reconfigurability have both been adopted by accelerators to improve their flexibility and efficiency for a wide variety of applications, from cloud computing to embedded systems…”
    Conference paper
  16.

    Hardware-Aware Machine Learning: Modeling and Optimization
    Authors: Marculescu, Diana; Stamoulis, Dimitrios; Cai, Ermao

    ISSN: 1558-2434
    Publication details: ACM 01.11.2018
    “…), have made DL models a key component in almost every modern computing system. The increased popularity of DL applications deployed on a wide-spectrum of platforms…”
    Conference paper
  17.

    MASR: A Modular Accelerator for Sparse RNNs
    Authors: Gupta, Udit; Reagen, Brandon; Pentecost, Lillian; Donato, Marco; Tambe, Thierry; Rush, Alexander M.; Wei, Gu-Yeon; Brooks, David

    ISSN: 2641-7936
    Publication details: IEEE 01.09.2019
    “… In this paper we present MASR, a principled and modular architecture that accelerates bidirectional RNNs for on-chip ASR…”
    Conference paper
  18.

    RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
    Authors: Jeong, Geonhwa; Qin, Eric; Samajdar, Ananda; Hughes, Christopher J.; Subramoney, Sreenivas; Kim, Hyesoon; Krishna, Tushar

    Publication details: IEEE 05.12.2021
    “…As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency. Systolic arrays have…”
    Conference paper
  19.

    RADiT: Redundancy-Aware Diffusion Transformer Acceleration Leveraging Timestep Similarity
    Authors: Park, Youngjun; Kim, Sangyeon; Kim, Yeonggeon; Ji, Gisan; Ryu, Sungju

    Publication details: IEEE 22.06.2025
    “…Diffusion Transformers (DiTs) have demonstrated unprecedented performance across various generative tasks including image and video generation. However, a…”
    Conference paper
  20.

    Buffer Prospector: Discovering and Exploiting Untapped Buffer Resources in Many-Core DNN Accelerators
    Authors: Wei, Yuchen; Cai, Jingwei; Gao, Mingyu; Peng, Sen; Wu, Zuotong; Shi, Guiming; Ma, Kaisheng

    Publication details: IEEE 22.06.2025
    “…In large-scale DNN inference accelerators, the many-core architecture has emerged as a predominant design, with layer-pipeline (LP…”
    Conference paper