Search results - Computation/Dataflow Optimization

1

MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks by Jang, Hanhwi, Kim, Joonsung, Jo, Jae-Eon, Lee, Jaewon, Kim, Jangwoo

ISSN: 2575-713X

Published: ACM 01.06.2019

Published in 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
“… Memory-augmented neural networks are getting more attention from many researchers as they can make an inference with the previous history stored in memory …”

Conference Proceeding

2

NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models by Kim, Joonsung, Hur, Suyeon, Lee, Eunbok, Lee, Seungho, Kim, Jangwoo

Published: IEEE 01.09.2021

Published in 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“… : three end-to-end optimization techniques to accelerate …”

Conference Proceeding

3

Reconfigurable Dataflow Optimization for Spatiotemporal Spiking Neural Computation on Systolic Array Accelerators by Lee, Jeong-Jun, Li, Peng

ISSN: 2576-6996

Published: IEEE 01.10.2020

Published in Proceedings - IEEE International Conference on Computer Design (01.10.2020)
“… Recognizing the need for efficient processing of complex spatiotemporal data while considering the all-or-none nature of spiking activities, we propose holistic reconfigurable dataflow optimization …”

Conference Proceeding

4

VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference by Liu, Zihan, Luo, Xinhao, Guo, Junxian, Ni, Wentao, Zhou, Yangjie, Guan, Yue, Guo, Cong, Cui, Weihao, Feng, Yu, Guo, Minyi, Zhu, Yuhao, Zhang, Minjia, Jin, Chen, Leng, Jingwen

ISSN: 2378-203X

Published: IEEE 01.03.2025

Published in Proceedings - International Symposium on High-Performance Computer Architecture (01.03.2025)
“… and uncoordinated computation dataflow. Meanwhile, the diversity of VQ algorithms (e.g., different vector sizes and entry counts …”

Conference Proceeding

5

Techniques for Efficient Performance Analysis and Memory Optimization in Mapping Dataflow Models of Computation Onto Embedded Systems by Luna, Mauro Martín Letras

ISBN: 9798346386964

Published: ProQuest Dissertations & Theses 01.01.2024

“… The power of modern multi-core and many-core platforms is an excellent fit for meeting the performance needs of embedded software applications. However, there …”

Dissertation

6

A high-performance dataflow-centric optimization framework for deep learning inference on the edge by Zhang, Runhua, Jiang, Hongxu, Geng, Jinkun, Tian, Fangzheng, Ma, Yuhang, Wang, Haojie

ISSN: 1383-7621, 1873-6165

Published: Elsevier B.V 01.07.2024

Published in Journal of systems architecture (01.07.2024)
“… Targeting the existing drawbacks of operator-centric frameworks, we design Xenos, which can automatically conduct dataflow-centric optimization of the computation graph and accelerate inference in two dimensions …”

Journal Article

7

SWG: an architecture for sparse weight gradient computation by Wu, Weiwei, Tu, Fengbin, Li, Xiangyu, Wei, Shaojun, Yin, Shouyi

ISSN: 1674-733X, 1869-1919

Published: Beijing Science China Press 01.02.2024

Published in Science China. Information sciences (01.02.2024)
“… Nevertheless, exploiting the optimization opportunities would meet three underutilization problems, which are caused by (1 …”

Journal Article

8

AttentionLib: A Scalable Optimization Framework for Automated Attention Acceleration on FPGA by Liu, Zhenyu, Zhou, Xilang, Sun, Faxian, Chen, Jianli, Yu, Jun, Wang, Kun

ISSN: 1558-1101

Published: EDAA 31.03.2025

Published in Proceedings - Design, Automation, and Test in Europe Conference and Exhibition (31.03.2025)
“… AttentionLib automatically performs fusion dataflow optimization for attention computations and generates high-level synthesis code in compliance …”

Conference Proceeding

9

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design by Fang, Chao, Sun, Wei, Zhou, Aojun, Wang, Zhongfeng

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.02.2024

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.02.2024)
“… Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained …”

Journal Article

10

Unleashing Network/Accelerator Co-Exploration Potential on FPGAs: A Deeper Joint Search by Lou, Wenqi, Gong, Lei, Wang, Chao, Qian, Jiaming, Wang, Xuan, Li, Changlong, Zhou, Xuehai

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.10.2024

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.10.2024)
“… Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts …”

Journal Article

11

Algorithm/Hardware Co-optimization for Sparsity-Aware SpMM Acceleration of GNNs by Gao, Yingxue, Gong, Lei, Wang, Chao, Wang, Teng, Li, Xi, Zhou, Xuehai

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.12.2023

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.12.2023)
“… So in this paper, we demonstrate an algorithm/hardware co-optimization chance to enhance SpMM acceleration for GNNs …”

Journal Article

12

SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism by Sbîrlea, Dragoş, Shirako, Jun, Newton, Ryan, Sarkar, Vivek

ISSN: 0885-7458, 1573-7640

Published: New York Springer US 01.04.2016

Published in International journal of parallel programming (01.04.2016)
“… This work shows that it is possible to exploit streaming as a safe and automatic optimization of a more general dataflow-based model …”

Journal Article

13

DiMO-Sparse: Differentiable Modeling and Optimization of Sparse CNN Dataflow and Hardware Architecture by Song, Jianfeng, Liang, Rongjian, Gong, Yu, Yuan, Bo, Hu, Jiang

ISSN: 1558-1101

Published: EDAA 25.03.2024

Published in Proceedings - Design, Automation, and Test in Europe Conference and Exhibition (25.03.2024)
“… To the best of our knowledge, this paper presents the first systematic investigation of automatic dataflow and hardware optimization for sparse CNN computation …”

Conference Proceeding

14

Mechanisms Towards Energy-Efficient Dynamic Hardware Specialization by Ho, Chen-Han

ISBN: 9781321384222, 132138422X

Published: ProQuest Dissertations & Theses 01.01.2014

“… In the past few decades, Von Neumann superscalar processors have been the prevalent approach for general purpose processing. Hardware specialization, as a …”

Dissertation

15

An FPGA-based efficient accelerator for fault interaction of rupture dynamics by Yuan, Ming, Liu, Qiang, Gan, Lin

ISSN: 0920-8542, 1573-0484

Published: New York Springer Nature B.V 10.09.2025

Published in The Journal of supercomputing (10.09.2025)
“… Efficiently predicting aftershocks based on rupture dynamics simulation is a crucial task in high-performance computing, traditionally dependent on …”

Journal Article

16

SpikeFlow: A hardware–software co-designed systolic array for spiking neural networks by Wang, Jianan, Shi, Yang, Chen, Zhaoyun, Wen, Mei

ISSN: 1383-7621

Published: Elsevier B.V 01.12.2025

Published in Journal of systems architecture (01.12.2025)
“… Spiking neural networks (SNNs), often referred to as third-generation neural networks, offer substantial advantages in efficiency and power consumption, which …”

Journal Article

17

CHAUS: Scalable VM-Based Channels for Unbounded Streaming by Zhang, Yu, Yu, Yu-Fen, Cao, Hui-Fang, Chen, Jian-Kang, Zhang, Qi-Liang

ISSN: 1000-9000, 1860-4749

Published: New York Springer US 01.11.2017

Published in Journal of computer science and technology (01.11.2017)
“… Stream processing is a special form of the dataflow execution model that offers extensive opportunities for optimization and automatic parallelism …”

Journal Article

18

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving by Lee, Minjae, Park, Seongmin, Kim, Hyungmin, Yoon, Minyong, Lee, Janghwan, Choi, Jun Won, Kim, Nam Sung, Kang, Mingu, Choi, Jungwook

ISSN: 2378-203X

Published: IEEE 02.03.2024

Published in Proceedings - International Symposium on High-Performance Computer Architecture (02.03.2024)
“… 3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting …”

Conference Proceeding

19

Dataflow computing models, languages, and machines for intelligence computations by Herath, J., Yamaguchi, Y., Saito, N., Yuba, T.

ISSN: 0098-5589, 1939-3520

Published: New York, NY IEEE 01.12.1988

Published in IEEE Transactions on Software Engineering (01.12.1988)
“… The authors compare dataflow computing models, languages, and dataflow computing machines for numerical and nonnumerical computations. The …”

Journal Article

20

PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based Rewriting by Zhou, Chijin, Qian, Bingzhou, Go, Gwihwan, Zhang, Quan, Li, Shanshan, Jiang, Yu

ISSN: 2475-1421

Published: New York, NY, USA ACM 08.10.2024

Published in Proceedings of ACM on programming languages (08.10.2024)
“… The main challenge is to construct equivalent graphs capable of efficiently exploring the diverse optimization logic during compilation …”

Journal Article

