Search results - Computation/Dataflow Optimization

  1.

    MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks by Jang, Hanhwi, Kim, Joonsung, Jo, Jae-Eon, Lee, Jaewon, Kim, Jangwoo

    ISSN: 2575-713X
    Published: ACM 01.06.2019
    “… Memory-augmented neural networks are getting more attention from many researchers as they can make an inference with the previous history stored in memory …”
    Full text
    Conference Proceeding
  2.
  3.

    Reconfigurable Dataflow Optimization for Spatiotemporal Spiking Neural Computation on Systolic Array Accelerators by Lee, Jeong-Jun, Li, Peng

    ISSN: 2576-6996
    Published: IEEE 01.10.2020
    “… Recognizing the need for efficient processing of complex spatiotemporal data while considering the all-or-none nature of spiking activities, we propose holistic reconfigurable dataflow optimization …”
    Full text
    Conference Proceeding
  4.

    VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference by Liu, Zihan, Luo, Xinhao, Guo, Junxian, Ni, Wentao, Zhou, Yangjie, Guan, Yue, Guo, Cong, Cui, Weihao, Feng, Yu, Guo, Minyi, Zhu, Yuhao, Zhang, Minjia, Jin, Chen, Leng, Jingwen

    ISSN: 2378-203X
    Published: IEEE 01.03.2025
    “… and uncoordinated computation dataflow. Meanwhile, the diversity of VQ algorithms (e.g., different vector sizes and entry counts …”
    Full text
    Conference Proceeding
  5.

    Techniques for Efficient Performance Analysis and Memory Optimization in Mapping Dataflow Models of Computation Onto Embedded Systems by Luna, Mauro Martín Letras

    ISBN: 9798346386964
    Published: ProQuest Dissertations & Theses 01.01.2024
    “… The power of modern multi-core and many-core platforms is an excellent fit for meeting the performance needs of embedded software applications. However, there …”
    Full text
    Dissertation
  6.

    A high-performance dataflow-centric optimization framework for deep learning inference on the edge by Zhang, Runhua, Jiang, Hongxu, Geng, Jinkun, Tian, Fangzheng, Ma, Yuhang, Wang, Haojie

    ISSN: 1383-7621, 1873-6165
    Published: Elsevier B.V 01.07.2024
    Published in Journal of systems architecture (01.07.2024)
    “… Targeting the existing drawbacks of operator-centric frameworks, we design Xenos, which can automatically conduct dataflow-centric optimization of the computation graph and accelerate inference in two dimensions …”
    Full text
    Journal Article
  7.

    SWG: an architecture for sparse weight gradient computation by Wu, Weiwei, Tu, Fengbin, Li, Xiangyu, Wei, Shaojun, Yin, Shouyi

    ISSN: 1674-733X, 1869-1919
    Published: Beijing Science China Press 01.02.2024
    Published in Science China. Information sciences (01.02.2024)
    “… Nevertheless, exploiting the optimization opportunities would meet three underutilization problems, which are caused by (1 …”
    Full text
    Journal Article
  8.

    AttentionLib: A Scalable Optimization Framework for Automated Attention Acceleration on FPGA by Liu, Zhenyu, Zhou, Xilang, Sun, Faxian, Chen, Jianli, Yu, Jun, Wang, Kun

    ISSN: 1558-1101
    Published: EDAA 31.03.2025
    “… AttentionLib automatically performs fusion dataflow optimization for attention computations and generates high-level synthesis code in compliance …”
    Full text
    Conference Proceeding
  9.

    Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design by Fang, Chao, Sun, Wei, Zhou, Aojun, Wang, Zhongfeng

    ISSN: 0278-0070, 1937-4151
    Published: New York IEEE 01.02.2024
    “… Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained …”
    Full text
    Journal Article
  10.

    Unleashing Network/Accelerator Co-Exploration Potential on FPGAs: A Deeper Joint Search by Lou, Wenqi, Gong, Lei, Wang, Chao, Qian, Jiaming, Wang, Xuan, Li, Changlong, Zhou, Xuehai

    ISSN: 0278-0070, 1937-4151
    Published: New York IEEE 01.10.2024
    “… Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts …”
    Full text
    Journal Article
  11.

    Algorithm/Hardware Co-optimization for Sparsity-Aware SpMM Acceleration of GNNs by Gao, Yingxue, Gong, Lei, Wang, Chao, Wang, Teng, Li, Xi, Zhou, Xuehai

    ISSN: 0278-0070, 1937-4151
    Published: New York IEEE 01.12.2023
    “… So in this paper, we demonstrate an algorithm/hardware co-optimization chance to enhance SpMM acceleration for GNNs …”
    Full text
    Journal Article
  12.

    SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism by Sbîrlea, Dragoş, Shirako, Jun, Newton, Ryan, Sarkar, Vivek

    ISSN: 0885-7458, 1573-7640
    Published: New York Springer US 01.04.2016
    Published in International journal of parallel programming (01.04.2016)
    “… This work shows that it is possible to exploit streaming as a safe and automatic optimization of a more general dataflow-based model …”
    Full text
    Journal Article
  13.

    DiMO-Sparse: Differentiable Modeling and Optimization of Sparse CNN Dataflow and Hardware Architecture by Song, Jianfeng, Liang, Rongjian, Gong, Yu, Yuan, Bo, Hu, Jiang

    ISSN: 1558-1101
    Published: EDAA 25.03.2024
    “… To the best of our knowledge, this paper presents the first systematic investigation of automatic dataflow and hardware optimization for sparse CNN computation …”
    Full text
    Conference Proceeding
  14.

    Mechanisms Towards Energy-Efficient Dynamic Hardware Specialization by Ho, Chen-Han

    ISBN: 9781321384222, 132138422X
    Published: ProQuest Dissertations & Theses 01.01.2014
    “… In the past few decades, Von Neumann superscalar processors have been the prevalent approach for general purpose processing. Hardware specialization, as a …”
    Full text
    Dissertation
  15.

    An FPGA-based efficient accelerator for fault interaction of rupture dynamics by Yuan, Ming, Liu, Qiang, Gan, Lin

    ISSN: 0920-8542, 1573-0484
    Published: New York Springer Nature B.V 10.09.2025
    Published in The Journal of supercomputing (10.09.2025)
    “… Efficiently predicting aftershocks based on rupture dynamics simulation is a crucial task in high-performance computing, traditionally dependent on …”
    Full text
    Journal Article
  16.

    SpikeFlow: A hardware–software co-designed systolic array for spiking neural networks by Wang, Jianan, Shi, Yang, Chen, Zhaoyun, Wen, Mei

    ISSN: 1383-7621
    Published: Elsevier B.V 01.12.2025
    Published in Journal of systems architecture (01.12.2025)
    “… Spiking neural networks (SNNs), often referred to as third-generation neural networks, offer substantial advantages in efficiency and power consumption, which …”
    Full text
    Journal Article
  17.

    CHAUS: Scalable VM-Based Channels for Unbounded Streaming by Zhang, Yu, Yu, Yu-Fen, Cao, Hui-Fang, Chen, Jian-Kang, Zhang, Qi-Liang

    ISSN: 1000-9000, 1860-4749
    Published: New York Springer US 01.11.2017
    Published in Journal of computer science and technology (01.11.2017)
    “… Stream processing is a special form of the dataflow execution model that offers extensive opportunities for optimization and automatic parallelism …”
    Full text
    Journal Article
  18.

    SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving by Lee, Minjae, Park, Seongmin, Kim, Hyungmin, Yoon, Minyong, Lee, Janghwan, Choi, Jun Won, Kim, Nam Sung, Kang, Mingu, Choi, Jungwook

    ISSN: 2378-203X
    Published: IEEE 02.03.2024
    “… 3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting …”
    Full text
    Conference Proceeding
  19.

    Dataflow computing models, languages, and machines for intelligence computations by Herath, J., Yamaguchi, Y., Saito, N., Yuba, T.

    ISSN: 0098-5589, 1939-3520
    Published: New York, NY IEEE 01.12.1988
    Published in IEEE Transactions on Software Engineering (01.12.1988)
    “… The authors compare dataflow computing models, languages, and dataflow computing machines for numerical and nonnumerical computations. The …”
    Full text
    Journal Article
  20.

    PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based Rewriting by Zhou, Chijin, Qian, Bingzhou, Go, Gwihwan, Zhang, Quan, Li, Shanshan, Jiang, Yu

    ISSN: 2475-1421
    Published: New York, NY, USA ACM 08.10.2024
    Published in Proceedings of ACM on programming languages (08.10.2024)
    “… The main challenge is to construct equivalent graphs capable of efficiently exploring the diverse optimization logic during compilation …”
    Full text
    Journal Article