Search results - "Computer systems organization Architectures Parallel architectures"
1
Splitwise: Efficient Generative LLM Inference Using Phase Splitting
Published: IEEE 29.06.2024
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Generative large language model (LLM) applications are growing rapidly, leading to large-scale deployments of expensive and power-hungry GPUs. Our…”
Conference paper
2
Scheduling techniques for GPU architectures with processing-in-memory capabilities
Published: ACM 01.09.2016
Published in: 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT) (01.09.2016)
“…Processing data in or near memory (PIM), as opposed to in conventional computational units in a processor, can greatly alleviate the performance and energy…”
Conference paper
3
Trapezoid: A Versatile Accelerator for Dense and Sparse Matrix Multiplications
Published: IEEE 29.06.2024
Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“…Accelerating matrix multiplication is crucial to achieve high performance in many application domains, including neural networks, graph analytics, and…”
Conference paper
4
CoNDA: Efficient Cache Coherence Support for Near-Data Accelerators
ISSN: 2575-713X
Published: ACM 01.06.2019
Published in: 2019 ACM/IEEE 46th Annual International Symposium on Computer Architecture (ISCA) (01.06.2019)
“…Specialized on-chip accelerators are widely used to improve the energy efficiency of computing systems. Recent advances in memory technology have enabled…”
Conference paper
5
MCM-GPU: Multi-chip-module GPUs for continued performance scalability
Published: ACM 01.06.2017
Published in: 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) (01.06.2017)
“…Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number…”
Conference paper
6
HIVE: A High-Priority Victim Cache for Accelerating GPU Memory Accesses
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…The victim cache was originally designed as a secondary cache to handle misses in the L1 data (L1D) cache in CPUs. However, this design is often sub-optimal…”
Conference paper
7
SCNN: An accelerator for compressed-sparse convolutional neural networks
Published: ACM 01.06.2017
Published in: 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA) (01.06.2017)
“…Convolutional Neural Networks (CNNs) have emerged as a fundamental technology for machine learning. High performance and extreme energy efficiency are critical…”
Conference paper
8
ACRS: Adjacent Computation Resource Sharing among Partitioned GPU Sub-Cores
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Modern GPUs typically segment Streaming Multiprocessors (SMs) into sub-cores (e.g. 4 sub-cores) to reduce power consumption and chip area. However, this…”
Conference paper
9
GoPTX: Fine-grained GPU Kernel Fusion by PTX-level Instruction Flow Weaving
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…GPUs have been heavily utilized in diverse applications, and numerous approaches, including kernel fusion, have been proposed to boost GPU efficiency through…”
Conference paper
10
SplitSync: Bank Group-Level Split-Synchronization for High-Performance DRAM PIM
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Processing in Memory (PIM) architectures enhance memory bandwidth by utilizing bank-level parallelism, typically implemented with a SIMD structure where all…”
Conference paper
11
Late Breaking Results: On-the-Fly Hadamard Hypervector Processing for Efficient Hyperdimensional Computing
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Inspired by the human brain, Hyperdimensional Computing (HDC) processes information efficiently by operating in high-dimensional space using hypervectors…”
Conference paper
12
RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU
Published: IEEE 05.12.2021
Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency. Systolic arrays have…”
Conference paper
13
Ambit: in-memory accelerator for bulk bitwise operations using commodity DRAM technology
ISBN: 1450349528, 9781450349529
ISSN: 2379-3155
Published: New York, NY, USA ACM 14.10.2017
Published in: MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA (14.10.2017)
“…Many important applications trigger bulk bitwise operations, i.e., bitwise operations on large bit vectors. In fact, recent works design techniques that…”
Conference paper
14
DenSparSA: A Balanced Systolic Array Approach for Dense and Sparse Matrix Multiplication
Published: IEEE 22.06.2025
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Numerous studies have proposed hardware architectures to accelerate sparse matrix multiplication, but these approaches often incur substantial area and power…”
Conference paper
15
INTERPRET: Inter-Warp Register Reuse for GPU Tensor Core
Published: IEEE 21.10.2023
Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023)
“…Tensor cores in the recent NVIDIA GPUs are under the spotlight due to their superior computation throughput for general matrix-matrix multiplication (GEMM)…”
Conference paper
16
A scalable processing-in-memory accelerator for parallel graph processing
ISSN: 1063-6897
Published: IEEE 01.06.2015
Published in: Proceedings - International Symposium on Computer Architecture (01.06.2015)
“…The explosion of digital data and the ever-growing need for fast data analysis have made in-memory big-data processing in computer systems increasingly…”
Conference paper
17
FSPA: An FeFET-based Sparse Matrix-Dense Vector Multiplication Accelerator
Published: IEEE 09.07.2023
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Sparse matrix-dense vector multiplication (SpMV) is widely used in various applications. The performance of traditional SpMV accelerators is bounded by memory…”
Conference paper
18
Spatz: A Compact Vector Processing Unit for High-Performance and Energy-Efficient Shared-L1 Clusters
ISSN: 1558-2434
Published: ACM 29.10.2022
Published in: 2022 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (29.10.2022)
“…While parallel architectures based on clusters of Processing Elements (PEs) sharing L1 memory are widespread, there is no consensus on how lean their PE should…”
Conference paper
19
Bit-pragmatic deep neural network computing
ISBN: 1450349528, 9781450349529
ISSN: 2379-3155
Published: New York, NY, USA ACM 14.10.2017
Published in: MICRO-50 : the 50th annual IEEE/ACM International Symposium on Microarchitecture : proceedings : October 14-18, 2017, Cambridge, MA (14.10.2017)
“…Deep Neural Networks expose a high degree of parallelism, making them amenable to highly data parallel architectures. However, data-parallel architectures…”
Conference paper
20
HexaMesh: Scaling to Hundreds of Chiplets with an Optimized Chiplet Arrangement
Published: IEEE 09.07.2023
Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…2.5D integration is an important technique to tackle the growing cost of manufacturing chips in advanced technology nodes. This poses the challenge of…”
Conference paper

