Search Results - "Computer systems organization Architectures Other architectures Reconfigurable computing"

1

DRISA: a DRAM-based Reconfigurable In-Situ Accelerator by Li, Shuangchen, Niu, Dimin, Malladi, Krishna T., Zheng, Hongzhong, Brennan, Bob, Xie, Yuan

ISBN: 1450349528, 9781450349529

ISSN: 2379-3155

Published: New York, NY, USA ACM 14.10.2017

Published in MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA (14.10.2017)
“…Data movement between the processing units and the memory in traditional von Neumann architecture is creating the "memory wall" problem. To bridge the gap, two…”

Conference Proceeding

2

Maximizing CNN accelerator efficiency through resource partitioning by Yongming Shen, Ferdman, Michael, Milder, Peter

Published: ACM 01.06.2017

Published in 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) (01.06.2017)
“…Convolutional neural networks (CNNs) are revolutionizing machine learning, but they present significant computational challenges. Recently, many FPGA-based…”

Conference Proceeding

3

Stream-dataflow acceleration by Nowatzki, Tony, Gangadhar, Vinay, Ardalani, Newsha, Sankaralingam, Karthikeyan

Published: ACM 01.06.2017

Published in 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) (01.06.2017)
“…Demand for low-power data processing hardware continues to rise inexorably. Existing programmable and "general purpose" solutions (eg. SIMD, GPGPUs) are…”

Conference Proceeding

4

Understanding and optimizing asynchronous low-precision stochastic gradient descent by De Sa, Christopher, Feldman, Matthew, Re, Christopher, Olukotun, Kunle

Published: ACM 01.06.2017

Published in 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) (01.06.2017)
“…Stochastic gradient descent (SGD) is one of the most popular numerical algorithms used in machine learning and other domains. Since this is likely to continue…”

Conference Proceeding

5

Qubit Mapping for Reconfigurable Atom Arrays by Tan, Bochen, Bluvstein, Dolev, Lukin, Mikhail D., Cong, Jason

ISSN: 1558-2434

Published: ACM 29.10.2022

Published in 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (29.10.2022)
“…Because of the largest number of qubits available, and the massive parallel execution of entangling two-qubit gates, atom arrays is a promising platform for…”

Conference Proceeding

6

Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks by Chen Zhang, Zhenman Fang, Peipei Zhou, Peichen Pan, Jason Cong

ISSN: 1558-2434

Published: ACM 01.11.2016

Published in Digest of technical papers - IEEE/ACM International Conference on Computer-Aided Design (01.11.2016)
“…With the recent advancement of multilayer convolutional neural networks (CNN), deep learning has achieved amazing success in many areas, especially in visual…”

Conference Proceeding

7

SODA: Stencil with Optimized Dataflow Architecture by Chi, Yuze, Cong, Jason, Wei, Peng, Zhou, Peipei

ISSN: 1558-2434

Published: ACM 01.11.2018

Published in 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (01.11.2018)
“…Stencil computation is one of the most important kernels in many application domains such as image processing, solving partial differential equations, and…”

Conference Proceeding

8

TGPA: Tile-Grained Pipeline Architecture for Low Latency CNN Inference by Wei, Xuechao, Liang, Yun, Li, Xiuhong, Yu, Cody Hao, Zhang, Peng, Cong, Jason

ISSN: 1558-2434

Published: ACM 01.11.2018

Published in 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (01.11.2018)
“…FPGAs are more and more widely used as reconfigurable hardware accelerators for applications leveraging convolutional neural networks (CNNs) in recent years…”

Conference Proceeding

9

MECLA: Memory-Compute-Efficient LLM Accelerator with Scaling Sub-matrix Partition by Qin, Yubin, Wang, Yang, Zhao, Zhiren, Yang, Xiaolong, Zhou, Yang, Wei, Shaojun, Hu, Yang, Yin, Shouyi

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Large language models (LLMs) have been showing surprising performance in processing language tasks, bringing a new prevalence to deploy LLM from cloud to edge…”

Conference Proceeding

10

Hardware-Aware Machine Learning: Modeling and Optimization by Marculescu, Diana, Stamoulis, Dimitrios, Cai, Ermao

ISSN: 1558-2434

Published: ACM 01.11.2018

Published in 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD) (01.11.2018)
“…Recent breakthroughs in Machine Learning (ML) applications, and especially in Deep Learning (DL), have made DL models a key component in almost every modern…”

Conference Proceeding

11

Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs by Bouzidi, Halima, Odema, Mohanad, Ouarnoughi, Hamza, Niar, Smail, Al Faruque, Mohammad Abdullah

Published: IEEE 09.07.2023

Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Heterogeneous MPSoCs comprise diverse processing units of varying compute capabilities. To date, the mapping strategies of neural networks (NNs) onto such…”

Conference Proceeding

12

A Memory-Efficient LLM Accelerator with Q-K Correlation Prediction using Cluster-Based Associative Array for Selective KV Accessing by Zhou, Zikang, Chen, Kaiqi, Duan, Xuyang, Han, Jun

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Attention-based LLMs excel in text generation but face redundant computations in autoregressive token generation. While KV cache mitigates this, it introduces…”

Conference Proceeding

13

Supporting Register-based Addressing Modes for in-DRAM PIM ISAs by Kim, Seok Young, Choi, Byung Ho, Kang, Seokwon, Park, Yongjun, Kim, Seon Wook

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Processing-in-Memory architecture presents a promising solution to alleviate the data movement bottleneck that arises from transferring data between memory and…”

Conference Proceeding

14

ARCANE: Adaptive RISC-V Cache Architecture for Near-memory Extensions by Petrolo, Vincenzo, Guella, Flavia, Caon, Michele, Schiavone, Pasquale Davide, Masera, Guido, Martina, Maurizio

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Modern data-driven applications expose limitations of von Neumann architectures-extensive data movement, low throughput, and poor energy efficiency…”

Conference Proceeding

15

HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices by Jeon, Sangmin, Lee, Kangju, Lee, Kyeongwon, Lee, Woojoo

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Processing-in-Memory (PIM) architectures offer promising solutions for efficiently handling AI applications in energy-constrained edge environments. While…”

Conference Proceeding

16

RADiT: Redundancy-Aware Diffusion Transformer Acceleration Leveraging Timestep Similarity by Park, Youngjun, Kim, Sangyeon, Kim, Yeonggeon, Ji, Gisan, Ryu, Sungju

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Diffusion Transformers (DiTs) have demonstrated unprecedented performance across various generative tasks including image and video generation. However, a…”

Conference Proceeding

17

Late Breaking Results: FPGen-3D: Automated Framework for 3D-FPGA Architecture Generation and Exploration by Youssef, Ismael, Hao, Cong Callie

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…In this work, we propose FPGen-3D, an automated framework for 3D field-programmable gate arrays (FPGA) architecture generation and exploration. FPGen-3D…”

Conference Proceeding

18

Configurable DSP-Based CAM Architecture for Data-Intensive Applications on FPGAs by Chen, Yao, Yu, Feng, Wu, Di, Wong, Weng-Fai, He, Bingsheng

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Content-addressable memory (CAM) is a type of fast memory unique in its ability to perform parallel searches of stored data based on content rather than…”

Conference Proceeding

19

VersaSlot: Efficient Fine-grained FPGA Sharing with Big.Little Slots and Live Migration in FPGA Cluster by Gu, Jianfeng, Wang, Hao, Guo, Xiaorang, Schulz, Martin, Gerndt, Michael

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…As FPGAs gain popularity for on-demand application acceleration in data center computing, dynamic partial reconfiguration (DPR) has become an effective…”

Conference Proceeding

20

MambaOPU: An FPGA Overlay Processor for State-space-duality-based Mamba Models by Lu, Shaoqiang, Yu, Xuliang, Zhao, Tiandong, Miao, Siyuan, Sheng, Xinsong, Wu, Chen, Zhao, Liang, Lin, Ting-Jung, He, Lei

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…State-space models (SSMs), such as Mamba, have emerged as a promising alternative to Transformers. However, the recently developed Mamba2, based on state space…”

Conference Proceeding
