Search Results - "Computing methodologies Computer graphics Graphics systems AND interfaces Graphics processors*"

  1.

    Fine-grained DRAM: energy-efficient DRAM for extreme bandwidth systems by O'Connor, Mike, Chatterjee, Niladrish, Lee, Donghyuk, Wilson, John, Agrawal, Aditya, Keckler, Stephen W., Dally, William J.

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Published: New York, NY, USA ACM 14.10.2017
    “…Future GPUs and other high-performance throughput processors will require multiple TB/s of bandwidth to DRAM. Satisfying this bandwidth demand within an…”
    Conference Proceeding
  2.

    MCM-GPU: Multi-chip-module GPUs for continued performance scalability by Arunkumar, Akhil, Bolotin, Evgeny, Cho, Benjamin, Milic, Ugljesa, Ebrahimi, Eiman, Villa, Oreste, Jaleel, Aamer, Wu, Carole-Jean, Nellans, David

    Published: ACM 01.06.2017
    “…Historically, improvements in GPU-based high performance computing have been tightly coupled to transistor scaling. As Moore's law slows down, and the number…”
    Conference Proceeding
  3.

    RT-NeRF: Real-Time On-Device Neural Radiance Fields Towards Immersive AR/VR Rendering by Li, Chaojian, Li, Sixu, Zhao, Yang, Zhu, Wenbo, Lin, Yingyan

    ISSN: 1558-2434
    Published: ACM 29.10.2022
    “…Neural Radiance Field (NeRF) based rendering has attracted growing attention thanks to its state-of-the-art (SOTA) rendering quality and wide applications in…”
    Conference Proceeding
  4.

    Interplay between Hardware Prefetcher and Page Eviction Policy in CPU-GPU Unified Virtual Memory by Ganguly, Debashis, Zhang, Ziyu, Yang, Jun, Melhem, Rami

    ISSN: 2575-713X
    Published: ACM 01.06.2019
    “…Memory capacity in GPGPUs is a major challenge for data-intensive applications with their ever increasing memory requirement. To fit a workload into the…”
    Conference Proceeding
  5.

    Adaptive Cache Management for Energy-Efficient GPU Computing by Xuhao Chen, Li-Wen Chang, Rodrigues, Christopher I., Jie Lv, Zhiying Wang, Wen-Mei Hwu

    ISSN: 1072-4451
    Published: IEEE 01.12.2014
    “…With the SIMT execution model, GPUs can hide memory latency through massive multithreading for many applications that have regular memory access patterns. To…”
    Conference Proceeding
  6.

    Rethinking Page Table Structure for Fast Address Translation in GPUs: A Fixed-Size Hashed Page Table by Jang, Sungbin, Park, Junhyeok, Kwon, Osang, Lee, Yongho, Hong, Seokin

    Published: ACM 13.10.2024
    “…GPU memory virtualization has become essential for efficient programming, memory management, and address space sharing among computing devices in heterogeneous…”
    Conference Proceeding
  7.

    Beyond the socket: NUMA-aware GPUs by Milic, Ugljesa, Villa, Oreste, Bolotin, Evgeny, Arunkumar, Akhil, Ebrahimi, Eiman, Jaleel, Aamer, Ramirez, Alex, Nellans, David

    ISBN: 1450349528, 9781450349529
    ISSN: 2379-3155
    Published: New York, NY, USA ACM 14.10.2017
    “…GPUs achieve high throughput and power efficiency by employing many small single instruction multiple thread (SIMT) cores. To minimize scheduling logic and…”
    Conference Proceeding Publication
  8.

    Vulkan-Sim: A GPU Architecture Simulator for Ray Tracing by Saed, Mohammadreza, Chou, Yuan Hsi, Liu, Lufei, Nowicki, Tyler, Aamodt, Tor M.

    Published: IEEE 01.10.2022
    “…Ray tracing can generate photorealistic images with more convincing visual effects compared to rasterization. Recent hardware advances have enabled ray tracing…”
    Conference Proceeding
  9.

    Mars: A MapReduce Framework on graphics processors by He, Bingsheng, Fang, Wenbin, Luo, Qiong, Govindaraju, Naga K., Wang, Tuyong

    Published: ACM 01.10.2008
    “…We design and implement Mars, a MapReduce framework, on graphics processors (GPUs). MapReduce is a distributed programming framework originally proposed by…”
    Conference Proceeding
  10.

    A case for Core-Assisted Bottleneck Acceleration in GPUs: Enabling flexible data compression with assist warps by Vijaykumar, Nandita, Pekhimenko, Gennady, Jog, Adwait, Bhowmick, Abhishek, Ausavarungnirun, Rachata, Das, Chita, Kandemir, Mahmut, Mowry, Todd C., Mutlu, Onur

    ISSN: 1063-6897
    Published: IEEE 13.06.2015
    “…Modern Graphics Processing Units (GPUs) are well provisioned to support the concurrent execution of thousands of threads. Unfortunately, different bottlenecks…”
    Conference Proceeding
  11.

    StocHD: Stochastic Hyperdimensional System for Efficient and Robust Learning from Raw Data by Poduval, Prathyush, Zou, Zhuowen, Najafi, Hassan, Homayoun, Houman, Imani, Mohsen

    Published: IEEE 05.12.2021
    “…Hyperdimensional Computing (HDC) is a neurally-inspired computation model working based on the observation that the human brain operates on high-dimensional…”
    Conference Proceeding
  12.

    Cambricon-D: Full-Network Differential Acceleration for Diffusion Models by Kong, Weihao, Hao, Yifan, Guo, Qi, Zhao, Yongwei, Song, Xinkai, Li, Xiaqing, Zou, Mo, Du, Zidong, Zhang, Rui, Liu, Chang, Wen, Yuanbo, Jin, Pengwei, Hu, Xing, Li, Wei, Xu, Zhiwei, Chen, Tianshi

    Published: IEEE 29.06.2024
    “…Diffusion models have made significant progress in current image generation tasks, thus becoming a prominent area of research. Diffusion models necessitate…”
    Conference Proceeding
  13.

    STREAMINGGS: Voxel-Based Streaming 3D Gaussian Splatting with Memory Optimization and Architectural Support by Zhang, Chenqi, Feng, Yu, Zhao, Jieru, Liu, Guangda, Ding, Wenchao, Wu, Chentao, Guo, Minyi

    Published: IEEE 22.06.2025
    “…3D Gaussian Splatting (3DGS) has gained popularity for its efficiency and sparse Gaussian-based representation. However, 3DGS struggles to meet the real-time…”
    Conference Proceeding
  14.

    Warped-Compression: Enabling power efficient GPUs through register compression by Lee, Sangpil, Kim, Keunsoo, Koo, Gunjae, Jeon, Hyeran, Ro, Won Woo, Annavaram, Murali

    ISSN: 1063-6897
    Published: IEEE 13.06.2015
    “…This paper presents Warped-Compression, a warp-level register compression scheme for reducing GPU power consumption. This work is motivated by the observation…”
    Conference Proceeding
  15.

    GPU-accelerated Path-based Timing Analysis by Guo, Guannan, Huang, Tsung-Wei, Lin, Yibo, Wong, Martin

    Published: IEEE 05.12.2021
    “…Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years…”
    Conference Proceeding
  16.

    Harnessing Conventional Video Processing Insights for Emerging 3D Video Generation Models: A Comprehensive Attention-aware Way by Zhao, Tianlang, Liu, Jun, Li, Xingyang, Ding, Li, Li, Jinhao, Li, Shuaiheng, Hu, Jinbo, Dai, Guohao

    Published: IEEE 22.06.2025
    “…Video Generation Models based on 3D full attention (3D-VGMs) have significantly enhanced video quality. However, their inference overhead remains substantial,…”
    Conference Proceeding
  17.

    SQ-DM: Accelerating Diffusion Models with Aggressive Quantization and Temporal Sparsity by Fan, Zichen, Dai, Steve, Venkatesan, Rangharajan, Sylvester, Dennis, Khailany, Brucek

    Published: IEEE 22.06.2025
    “…Diffusion models have gained significant popularity in image generation tasks. However, generating high-quality content remains notably slow because it…”
    Conference Proceeding
  18.

    GEM: GPU-Accelerated Emulator-Inspired RTL Simulation by Guo, Zizheng, Zhang, Yanqing, Wang, Runsheng, Lin, Yibo, Ren, Haoxing

    Published: IEEE 22.06.2025
    “…In this paper, we present a GPU-accelerated RTL simulator addressing critical challenges in high-speed circuit verification. Traditional CPU-based RTL…”
    Conference Proceeding
  19.

    ACRS: Adjacent Computation Resource Sharing among Partitioned GPU Sub-Cores by Song, Penghao, Wang, Chongxi, Han, Chenji, Zhao, Haoyu, Zhang, Tingting, Liu, Tianyi, Wang, Jian

    Published: IEEE 22.06.2025
    “…Modern GPUs typically segment Streaming Multiprocessors (SMs) into sub-cores (e.g. 4 sub-cores) to reduce power consumption and chip area. However, this…”
    Conference Proceeding
  20.

    GauRast: Enhancing GPU Triangle Rasterizers to Accelerate 3D Gaussian Splatting by Li, Sixu, Keller, Ben, Lin, Yingyan Celine, Khailany, Brucek

    Published: IEEE 22.06.2025
    “…3D intelligence leverages rich 3D features and stands as a promising frontier in AI, with 3D rendering fundamental to many downstream applications. 3D Gaussian…”
    Conference Proceeding