Search Results - "Computer systems organization Architectures Serial architectures Pipeline computing"

1

Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters by Cavalcante, Matheus, Wuthrich, Domenic, Perotti, Matteo, Riedel, Samuel, Benini, Luca

ISSN: 1558-2434

Published: ACM 29.10.2022

Published in 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (29.10.2022)
“…While parallel architectures based on clusters of Processing Elements (PEs) sharing L1 memory are widespread, there is no consensus on how lean their PE should…”

Conference Proceeding

2

Buffer Prospector: Discovering and Exploiting Untapped Buffer Resources in Many-Core DNN Accelerators by Wei, Yuchen, Cai, Jingwei, Gao, Mingyu, Peng, Sen, Wu, Zuotong, Shi, Guiming, Ma, Kaisheng

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…In large-scale DNN inference accelerators, the many-core architecture has emerged as a predominant design, with layer-pipeline (LP) mapping being a mainstream…”

Conference Proceeding

3

Lookup Table-based Multiplication-free All-digital DNN Accelerator Featuring Self-Synchronous Pipeline Accumulation by Tagata, Hiroto, Sato, Takashi, Awano, Hiromitsu

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Deep neural networks (DNNs) have been widely applied in our society, yet reducing power consumption due to large-scale matrix computations remains a critical…”

Conference Proceeding

4

UDP: Utility-Driven Fetch Directed Instruction Prefetching by Oh, Surim, Xu, Mingsheng, Khan, Tanvir Ahmed, Kasikci, Baris, Litz, Heiner

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Datacenter applications exhibit large instruction footprints causing significant instruction cache misses and, as a result, frontend stalls. To address this…”

Conference Proceeding

5

Constable: Improving Performance and Power Efficiency by Safely Eliminating Load Instruction Execution by Bera, Rahul, Ranganathan, Adithya, Rakshit, Joydeep, Mahto, Sujit, Nori, Anant V., Gaur, Jayesh, Olgun, Ataberk, Kanellopoulos, Konstantinos, Sadrosadati, Mohammad, Subramoney, Sreenivas, Mutlu, Onur

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Load instructions often limit instruction-level parallelism (ILP) in modern processors due to data and resource dependences they cause. Prior techniques like…”

Conference Proceeding

6

Alternate Path Fetch by Deshmukh, Aniket, Cai, Lingzhe Chester, Patt, Yale N.

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Modern out-of-order cores rely on a large instruction supply from the processor frontend to achieve high performance. This requires building wider pipelines…”

Conference Proceeding

7

Alternate Path μ-op Cache Prefetching by Singh, Sawan, Perais, Arthur, Jimborean, Alexandra, Ros, Alberto

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Datacenter applications are well-known for their large code footprints. This has caused frontend design to evolve by implementing decoupled fetching and large…”

Conference Proceeding

8

Sparse-T: Hardware accelerator thread for unstructured sparse data processing by Vasireddy, Pranathi, Kavi, Krishna, Mehta, Gayatri

ISSN: 1558-2434

Published: ACM 29.10.2022

Published in 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (29.10.2022)
“…Sparse matrix-dense vector (SpMV) multiplication is inherent in most scientific, neural networks and machine learning algorithms. To efficiently exploit…”

Conference Proceeding

9

Bit-level Perceptron Prediction for Indirect Branches by Garza, Elba, Mirbagher-Ajorpaz, Samira, Khan, Tahsin Ahmad, Jimenez, Daniel A.

ISSN: 2575-713X

Published: ACM 01.06.2019

Published in 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
“…Modern software uses indirect branches for various purposes including, but not limited to, virtual method dispatch and implementation of switch statements…”

Conference Proceeding

10

AVM-BTB: Adaptive and Virtualized Multi-level Branch Target Buffer by Liu, Yunzhe, Li, Xinyu, Zhang, Tingting, Liu, Tianyi, Guo, Qi, Zhang, Fuxin, Wang, Jian

Published: IEEE 29.06.2024

Published in 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Branch Target Buffer (BTB) plays an important role in modern processors. It is used to identify branches in the instruction stream and predict branch targets…”

Conference Proceeding

11

PipeLink: A Pipelined Resource Sharing System for Dataflow High-Level Synthesis by Li, Rui, Berkley, Lincoln, Manohar, Rajit

Published: IEEE 22.06.2025

Published in 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Dynamically scheduled high-level synthesis (HLS) is an approach to HLS that maps programs into dataflow circuits. These circuits use distributed control for…”

Conference Proceeding

12

UpPipe: A Novel Pipeline Management on In-Memory Processors for RNA-seq Quantification by Chen, Liang-Chi, Ho, Chien-Chung, Chang, Yuan-Hao

Published: IEEE 09.07.2023

Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…RNA sequence quantification is an important analysis method to measure transcript abundances. A key overhead in RNA-seq quantification is to map a set of RNA…”

Conference Proceeding

13

Leaky MDU: ARM Memory Disambiguation Unit Uncovered and Vulnerabilities Exposed by Liu, Chang, Lyu, Yongqiang, Wang, Haixia, Qiu, Pengfei, Ju, Dapeng, Qu, Gang, Wang, Dongsheng

Published: IEEE 09.07.2023

Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Memory Disambiguation Unit (MDU) is widely used on modern processors to speculatively execute load instructions and improve pipeline performance. Given that…”

Conference Proceeding

14

Load value prediction via path-based address prediction: avoiding mispredictions due to conflicting stores by Sheikh, Rami, Cain, Harold W., Damodaran, Raguram

ISBN: 1450349528, 9781450349529

ISSN: 2379-3155

Published: New York, NY, USA ACM 14.10.2017

Published in MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA (14.10.2017)
“…Current flagship processors excel at extracting instruction-level-parallelism (ILP) by forming large instruction windows. Even then, extracting ILP is…”

Conference Proceeding

15

MixPipe: Efficient Bidirectional Pipeline Parallelism for Training Large-Scale Models by Zhang, Weigang, Zhou, Biyu, Tang, Xuehai, Wang, Zhaoxing, Hu, Songlin

Published: IEEE 09.07.2023

Published in 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…The rapid development of large-scale deep neural networks has put forward an urgent demand for the efficiency of parallel training. Recently, bidirectional…”

Conference Proceeding

16

SMT-COP: Defeating Side-Channel Attacks on Execution Units in SMT Processors by Townley, Daniel, Ponomarev, Dmitry

ISSN: 2641-7936

Published: IEEE 01.09.2019

Published in Proceedings / International Conference on Parallel Architectures and Compilation Techniques (01.09.2019)
“…Recent advances in side-channel attacks put into question the viability of Simultaneous Multithreading (SMT) architectures from the security standpoint. To…”

Conference Proceeding

17

Filter Caching for Free: The Untapped Potential of the Store-Buffer by Alves, Ricardo, Ros, Alberto, Black-Schaffer, David, Kaxiras, Stefanos

ISSN: 2575-713X

Published: ACM 01.06.2019

Published in 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
“…Modern processors contain store-buffers to allow stores to retire under a miss, thus hiding store-miss latency. The store-buffer needs to be large (for…”

Conference Proceeding

18

FabScalar: composing synthesizable RTL designs of arbitrary cores within a canonical superscalar template by Choudhary, Niket K., Wadhavkar, Salil V., Shah, Tanmay A., Mayukh, Hiran, Gandhi, Jayneel, Dwiel, Brandon H., Navada, Sandeep, Najaf-abadi, Hashem H., Rotenberg, Eric

ISBN: 9781450304726, 1450304729

ISSN: 1063-6897

Published: New York, NY, USA ACM 04.06.2011

Published in 2011 38th Annual International Symposium on Computer Architecture (ISCA) (04.06.2011)
“…A growing body of work has compiled a strong case for the single-ISA heterogeneous multi-core paradigm. A single-ISA heterogeneous multi-core provides…”

Conference Proceeding

19

X-Layer: Building Composable Pipelined Dataflows for Low-Rank Convolutions by Vedula, Naveen, Hojabr, Reza, Khonsari, Ahmad, Shriraman, Arrvindh

Published: IEEE 01.09.2021

Published in 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“…Prior research in hardware accelerators has largely focused on spatial convolutions (CONV). However, state-of-the-art DNNs employ low-rank convolutions…”

Conference Proceeding

20

Pipelining a triggered processing element by Repetti, Thomas J., Cerqueira, João P., Kim, Martha A., Seok, Mingoo

ISBN: 1450349528, 9781450349529

ISSN: 2379-3155

Published: New York, NY, USA ACM 14.10.2017

Published in MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA (14.10.2017)
“…Programmable spatial architectures composed of ensembles of autonomous fixed-ISA processing elements offer a compelling design point between the flexibility of…”

Conference Proceeding
