Search Results - Computation/Dataflow Optimization

1

MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks by Jang, Hanhwi, Kim, Joonsung, Jo, Jae-Eon, Lee, Jaewon, Kim, Jangwoo

ISSN: 2575-713X

Published: ACM 01.06.2019

Published in 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
“…Memory-augmented neural networks are getting more attention from many researchers as they can make an inference with the previous history stored in memory…”

Conference Proceeding

2

NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models by Kim, Joonsung, Hur, Suyeon, Lee, Eunbok, Lee, Seungho, Kim, Jangwoo

Published: IEEE 01.09.2021

Published in 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“…: three end-to-end optimization techniques to accelerate…”

Conference Proceeding

3

Reconfigurable Dataflow Optimization for Spatiotemporal Spiking Neural Computation on Systolic Array Accelerators by Lee, Jeong-Jun, Li, Peng

ISSN: 2576-6996

Published: IEEE 01.10.2020

Published in Proceedings - IEEE International Conference on Computer Design (01.10.2020)
“… Recognizing the need for efficient processing of complex spatiotemporal data while considering the all-or-none nature of spiking activities, we propose holistic reconfigurable dataflow optimization…”

Conference Proceeding

4

VQ-LLM: High-performance Code Generation for Vector Quantization Augmented LLM Inference by Liu, Zihan, Luo, Xinhao, Guo, Junxian, Ni, Wentao, Zhou, Yangjie, Guan, Yue, Guo, Cong, Cui, Weihao, Feng, Yu, Guo, Minyi, Zhu, Yuhao, Zhang, Minjia, Jin, Chen, Leng, Jingwen

ISSN: 2378-203X

Published: IEEE 01.03.2025

Published in Proceedings - International Symposium on High-Performance Computer Architecture (01.03.2025)
“… and uncoordinated computation dataflow. Meanwhile, the diversity of VQ algorithms (e.g., different vector sizes and entry counts…”

Conference Proceeding

5

Techniques for Efficient Performance Analysis and Memory Optimization in Mapping Dataflow Models of Computation Onto Embedded Systems by Luna, Mauro Martín Letras

ISBN: 9798346386964

Published: ProQuest Dissertations & Theses 01.01.2024

“…The power of modern multi-core and many-core platforms is an excellent fit for meeting the performance needs of embedded software applications. However, there…”

Dissertation

6

A high-performance dataflow-centric optimization framework for deep learning inference on the edge by Zhang, Runhua, Jiang, Hongxu, Geng, Jinkun, Tian, Fangzheng, Ma, Yuhang, Wang, Haojie

ISSN: 1383-7621, 1873-6165

Published: Elsevier B.V 01.07.2024

Published in Journal of systems architecture (01.07.2024)
“… Targeting the existing drawbacks of operator-centric frameworks, we design Xenos, which can automatically conduct dataflow-centric optimization of the computation graph and accelerate inference in two dimensions…”

Journal Article

7

SWG: an architecture for sparse weight gradient computation by Wu, Weiwei, Tu, Fengbin, Li, Xiangyu, Wei, Shaojun, Yin, Shouyi

ISSN: 1674-733X, 1869-1919

Published: Beijing Science China Press 01.02.2024

Published in Science China. Information sciences (01.02.2024)
“… Nevertheless, exploiting the optimization opportunities would meet three underutilization problems, which are caused by (1…”

Journal Article

8

AttentionLib: A Scalable Optimization Framework for Automated Attention Acceleration on FPGA by Liu, Zhenyu, Zhou, Xilang, Sun, Faxian, Chen, Jianli, Yu, Jun, Wang, Kun

ISSN: 1558-1101

Published: EDAA 31.03.2025

Published in Proceedings - Design, Automation, and Test in Europe Conference and Exhibition (31.03.2025)
“… AttentionLib automatically performs fusion dataflow optimization for attention computations and generates high-level synthesis code in compliance…”

Conference Proceeding

9

Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design by Fang, Chao, Sun, Wei, Zhou, Aojun, Wang, Zhongfeng

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.02.2024

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.02.2024)
“…Sparse training is one of the promising techniques to reduce the computational cost of DNNs while retaining high accuracy. In particular, N:M fine-grained…”

Journal Article

10

Unleashing Network/Accelerator Co-Exploration Potential on FPGAs: A Deeper Joint Search by Lou, Wenqi, Gong, Lei, Wang, Chao, Qian, Jiaming, Wang, Xuan, Li, Changlong, Zhou, Xuehai

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.10.2024

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.10.2024)
“…Recently, algorithm-hardware (HW) co-exploration for neural networks (NNs) has become the key to obtaining high-quality solutions. However, previous efforts…”

Journal Article

11

Algorithm/Hardware Co-optimization for Sparsity-Aware SpMM Acceleration of GNNs by Gao, Yingxue, Gong, Lei, Wang, Chao, Wang, Teng, Li, Xi, Zhou, Xuehai

ISSN: 0278-0070, 1937-4151

Published: New York IEEE 01.12.2023

Published in IEEE transactions on computer-aided design of integrated circuits and systems (01.12.2023)
“… So in this paper, we demonstrate an algorithm/hardware co-optimization chance to enhance SpMM acceleration for GNNs…”

Journal Article

12

SCnC: Efficient Unification of Streaming with Dynamic Task Parallelism by Sbîrlea, Dragoş, Shirako, Jun, Newton, Ryan, Sarkar, Vivek

ISSN: 0885-7458, 1573-7640

Published: New York Springer US 01.04.2016

Published in International journal of parallel programming (01.04.2016)
“… This work shows that it is possible to exploit streaming as a safe and automatic optimization of a more general dataflow-based model…”

Journal Article

13

DiMO-Sparse: Differentiable Modeling and Optimization of Sparse CNN Dataflow and Hardware Architecture by Song, Jianfeng, Liang, Rongjian, Gong, Yu, Yuan, Bo, Hu, Jiang

ISSN: 1558-1101

Published: EDAA 25.03.2024

Published in Proceedings - Design, Automation, and Test in Europe Conference and Exhibition (25.03.2024)
“… To the best of our knowledge, this paper presents the first systematic investigation of automatic dataflow and hardware optimization for sparse CNN computation…”

Conference Proceeding

14

Mechanisms Towards Energy-Efficient Dynamic Hardware Specialization by Ho, Chen-Han

ISBN: 9781321384222, 132138422X

Published: ProQuest Dissertations & Theses 01.01.2014

“…In the past few decades, Von Neumann superscalar processors have been the prevalent approach for general purpose processing. Hardware specialization, as a…”

Dissertation

15

An FPGA-based efficient accelerator for fault interaction of rupture dynamics by Yuan, Ming, Liu, Qiang, Gan, Lin

ISSN: 0920-8542, 1573-0484

Published: New York Springer Nature B.V 10.09.2025

Published in The Journal of supercomputing (10.09.2025)
“…Efficiently predicting aftershocks based on rupture dynamics simulation is a crucial task in high-performance computing, traditionally dependent on…”

Journal Article

16

SpikeFlow: A hardware–software co-designed systolic array for spiking neural networks by Wang, Jianan, Shi, Yang, Chen, Zhaoyun, Wen, Mei

ISSN: 1383-7621

Published: Elsevier B.V 01.12.2025

Published in Journal of systems architecture (01.12.2025)
“…Spiking neural networks (SNNs), often referred to as third-generation neural networks, offer substantial advantages in efficiency and power consumption, which…”

Journal Article

17

CHAUS: Scalable VM-Based Channels for Unbounded Streaming by Zhang, Yu, Yu, Yu-Fen, Cao, Hui-Fang, Chen, Jian-Kang, Zhang, Qi-Liang

ISSN: 1000-9000, 1860-4749

Published: New York Springer US 01.11.2017

Published in Journal of computer science and technology (01.11.2017)
“…Stream processing is a special form of the dataflow execution model that offers extensive opportunities for optimization and automatic parallelism…”

Journal Article

18

SPADE: Sparse Pillar-based 3D Object Detection Accelerator for Autonomous Driving by Lee, Minjae, Park, Seongmin, Kim, Hyungmin, Yoon, Minyong, Lee, Janghwan, Choi, Jun Won, Kim, Nam Sung, Kang, Mingu, Choi, Jungwook

ISSN: 2378-203X

Published: IEEE 02.03.2024

Published in Proceedings - International Symposium on High-Performance Computer Architecture (02.03.2024)
“…3D object detection using point cloud (PC) data is essential for perception pipelines of autonomous driving, where efficient encoding is key to meeting…”

Conference Proceeding

19

Dataflow computing models, languages, and machines for intelligence computations by Herath, J., Yamaguchi, Y., Saito, N., Yuba, T.

ISSN: 0098-5589, 1939-3520

Published: New York, NY IEEE 01.12.1988

Published in IEEE Transactions on Software Engineering (01.12.1988)
“…The authors compare dataflow computing models, languages, and dataflow computing machines for numerical and nonnumerical computations. The…”

Journal Article

20

PolyJuice: Detecting Mis-compilation Bugs in Tensor Compilers with Equality Saturation Based Rewriting by Zhou, Chijin, Qian, Bingzhou, Go, Gwihwan, Zhang, Quan, Li, Shanshan, Jiang, Yu

ISSN: 2475-1421

Published: New York, NY, USA ACM 08.10.2024

Published in Proceedings of ACM on programming languages (08.10.2024)
“… The main challenge is to construct equivalent graphs capable of efficiently exploring the diverse optimization logic during compilation…”

Journal Article

