Search results - "Hardware Integrated circuits Reconfigurable logic and FPGAs Hardware accelerators"

  1.

    Ambit: in-memory accelerator for bulk bitwise operations using commodity DRAM technology
    Authors: Seshadri, Vivek; Lee, Donghyuk; Mullins, Thomas; Hassan, Hasan; Boroumand, Amirali; Kim, Jeremie; Kozuch, Michael A.; Mutlu, Onur; Gibbons, Phillip B.; Mowry, Todd C.

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Published: New York, NY, USA: ACM, 14.10.2017
    “…Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large bit vectors. In fact, recent works design techniques that…”
    Conference paper
  2.
  3.

    GAMMA: Automating the HW Mapping of DNN Models on Accelerators via Genetic Algorithm
    Authors: Kao, Sheng-Chun; Krishna, Tushar

    ISSN: 1558-2434
    Published: Association for Computing Machinery, 02.11.2020
    “…DNN layers are multi-dimensional loops that can be ordered, tiled, and scheduled in myriad ways across space and time on DNN accelerators. Each of these…”
    Conference paper
  4.

    ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization
    Authors: Guo, Cong; Zhang, Chen; Leng, Jingwen; Liu, Zihan; Yang, Fan; Liu, Yunxin; Guo, Minyi; Zhu, Yuhao

    Published: IEEE, 01.10.2022
    “…Quantization is a technique to reduce the computation and memory cost of DNN models, which are getting increasingly large. Existing quantization solutions use…”
    Conference paper
  5.

    Google Neural Network Models for Edge Devices: Analyzing and Mitigating Machine Learning Inference Bottlenecks
    Authors: Boroumand, Amirali; Ghose, Saugata; Akin, Berkin; Narayanaswami, Ravi; Oliveira, Geraldo F.; Ma, Xiaoyu; Shiu, Eric; Mutlu, Onur

    Published: IEEE, 01.09.2021
    “…Emerging edge computing platforms often contain machine learning (ML) accelerators that can accelerate inference for a wide range of neural network (NN)…”
    Conference paper
  6.

    CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators
    Authors: Boroumand, Amirali; Ghose, Saugata; Patel, Minesh; Hassan, Hasan; Lucia, Brandon; Ausavarungnirun, Rachata; Hsieh, Kevin; Hajinazar, Nastaran; Malladi, Krishna T.; Zheng, Hongzhong; Mutlu, Onur

    ISSN: 2575-713X
    Published: ACM, 01.06.2019
    “…Specialized on-chip accelerators are widely used to improve the energy efficiency of computing systems. Recent advances in memory technology have enabled…”
    Conference paper
  7.

    Adaptable Butterfly Accelerator for Attention-based NNs via Hardware and Algorithm Co-design
    Authors: Fan, Hongxiang; Chau, Thomas; Venieris, Stylianos I.; Lee, Royson; Kouris, Alexandros; Luk, Wayne; Lane, Nicholas D.; Abdelfattah, Mohamed S.

    Published: IEEE, 01.10.2022
    “…Attention-based neural networks have become pervasive in many AI tasks. Despite their excellent algorithmic performance, the use of the attention mechanism and…”
    Conference paper
  8.

    NAAS: Neural Accelerator Architecture Search
    Authors: Lin, Yujun; Yang, Mengtian; Han, Song

    Published: IEEE, 05.12.2021
    “…Data-driven, automatic design space exploration of neural accelerator architecture is desirable for specialization and productivity. Previous frameworks focus…”
    Conference paper
  9.

    Laconic Deep Learning Inference Acceleration
    Authors: Sharify, Sayeh; Lascorz, Alberto Delmas; Mahmoud, Mostafa; Nikolic, Milos; Siu, Kevin; Stuart, Dylan Malone; Poulos, Zissis; Moshovos, Andreas

    ISSN: 2575-713X
    Published: ACM, 01.06.2019
    “…We present a method for transparently identifying ineffectual computations during inference with Deep Learning models. Specifically, by decomposing…”
    Conference paper
  10.

    DFX: A Low-latency Multi-FPGA Appliance for Accelerating Transformer-based Text Generation
    Authors: Hong, Seongmin; Moon, Seungjae; Kim, Junsoo; Lee, Sungjae; Kim, Minsub; Lee, Dongsoo; Kim, Joo-Young

    Published: IEEE, 01.10.2022
    “…Transformer is a deep learning language model widely used for natural language processing (NLP) services in datacenters. Among transformer models, Generative…”
    Conference paper
  11.

    PolySA: Polyhedral-Based Systolic Array Auto-Compilation
    Authors: Cong, Jason; Wang, Jie

    ISSN: 1558-2434
    Published: ACM, 01.11.2018
    “…Automatic systolic array generation has long been an interesting topic due to the need to reduce the lengthy development cycles of manual designs. Existing…”
    Conference paper
  12.

    CrossLight: A Cross-Layer Optimized Silicon Photonic Neural Network Accelerator
    Authors: Sunny, Febin; Mirza, Asif; Nikdast, Mahdi; Pasricha, Sudeep

    Published: IEEE, 05.12.2021
    “…Domain-specific neural network accelerators have seen growing interest in recent years due to their improved energy efficiency and performance compared to CPUs…”
    Conference paper
  13.

    Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks
    Authors: Zhang, Chen; Fang, Zhenman; Zhou, Peipei; Pan, Peichen; Cong, Jason

    ISSN: 1558-2434
    Published: ACM, 01.11.2016
    “…With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual…”
    Conference paper
  14.

    SODA: Stencil with Optimized Dataflow Architecture
    Authors: Chi, Yuze; Cong, Jason; Wei, Peng; Zhou, Peipei

    ISSN: 1558-2434
    Published: ACM, 01.11.2018
    “…Stencil computation is one of the most important kernels in many application domains such as image processing, solving partial differential equations, and…”
    Conference paper
  15.

    MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks
    Authors: Jang, Hanhwi; Kim, Joonsung; Jo, Jae-Eon; Lee, Jaewon; Kim, Jangwoo

    ISSN: 2575-713X
    Published: ACM, 01.06.2019
    “…Memory-augmented neural networks are getting more attention from many researchers as they can make an inference with the previous history stored in memory…”
    Conference paper
  16.

    Efficient Hardware Acceleration of CNNs using Logarithmic Data Representation with Arbitrary log-base
    Authors: Vogel, Sebastian; Liang, Mengyu; Guntoro, Andre; Stechele, Walter; Ascheid, Gerd

    ISSN: 1558-2434
    Published: ACM, 01.11.2018
    “…Efficient acceleration of Deep Neural Networks is a manifold task. In order to save memory requirements and reduce energy consumption we propose the use of…”
    Conference paper
  17.

    LLMCompass: Enabling Efficient Hardware Design for Large Language Model Inference
    Authors: Zhang, Hengrui; Ning, August; Prabhakar, Rohan Baskar; Wentzlaff, David

    Published: IEEE, 29.06.2024
    “…The past year has witnessed the increasing popularity of Large Language Models (LLMs). Their unprecedented scale and associated high hardware cost have impeded…”
    Conference paper
  18.

    TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference
    Authors: Wei, Xuechao; Liang, Yun; Li, Xiuhong; Yu, Cody Hao; Zhang, Peng; Cong, Jason

    ISSN: 1558-2434
    Published: ACM, 01.11.2018
    “…FPGAs are more and more widely used as reconfigurable hardware accelerators for applications leveraging convolutional neural networks (CNNs) in recent years…”
    Conference paper
  19.

    Energy-Efficient Video Processing for Virtual Reality
    Authors: Leng, Yue; Chen, Chi-Chun; Sun, Qiuyue; Huang, Jian; Zhu, Yuhao

    ISSN: 2575-713X
    Published: ACM, 01.06.2019
    “…Virtual reality (VR) has huge potential to enable radically new applications, behind which spherical panoramic video processing is one of the backbone…”
    Conference paper
  20.

    MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition
    Authors: Qin, Yubin; Wang, Yang; Zhao, Zhiren; Yang, Xiaolong; Zhou, Yang; Wei, Shaojun; Hu, Yang; Yin, Shouyi

    Published: IEEE, 29.06.2024
    “…Large language models (LLMs) have been showing surprising performance in processing language tasks, bringing a new prevalence to deploy LLM from cloud to edge…”
    Conference paper