Search Results - "Design and analysis of algorithms"

  1.

    Late Breaking Results: An Efficient and Scalable Track Assignment with GPU Parallelism by Liu, Genggeng, Huang, Pengcheng, Li, Zepeng, Liu, Wen-Hao, Huang, Xing, Guo, Wenzhong

    Published: IEEE 22.06.2025
    “…The track assignment has been introduced between global routing and detail routing. Based on the independence and divisibility of track assignment, we propose…”
    Conference Proceeding
  2.

    DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems by Song, Ruibing, Wu, Chunshu, Liu, Chuan, Li, Ang, Huang, Michael, Geng, Tony Tong

    Published: IEEE 29.06.2024
    “…With the rapid digitization of the world, an increasing number of real-world applications are turning to non-Euclidean data, modeled as graphs. Due to their…”
    Conference Proceeding
  3.

    BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing by Han, Seunghee, Moon, Seungjae, Suh, Teokkyu, Heo, JaeHoon, Kim, Joo-Young

    Published: IEEE 29.06.2024
    “…In an era marked by the pervasive spread of harmful viruses like COVID-19, the importance of DNA sequencing has grown significantly, given its crucial role in…”
    Conference Proceeding
  4.

    BlasPart: A Deterministic Parallel Partitioner for Balanced Large-Scale Hypergraph Partitioning by Tong, Shengbo, Pei, Chunyan, Yu, Wenjian

    Published: IEEE 22.06.2025
    “…Balanced hypergraph partitioning is a fundamental problem in applications like VLSI design, high-performance computing, etc. Nowadays, large-scale hypergraphs…”
    Conference Proceeding
  5.

    Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs by Wang, Pengyu, Li, Chao, Wang, Jing, Wang, Taolei, Zhang, Lu, Leng, Jingwen, Chen, Quan, Guo, Minyi

    Published: IEEE 01.09.2021
    “…Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt…”
    Conference Proceeding
  6.

    PertNAS: Architectural Perturbations for Memory-Efficient Neural Architecture Search by Ahmad, Afzal, Xie, Zhiyao, Zhang, Wei

    Published: IEEE 09.07.2023
    “…Differentiable Neural Architecture Search (NAS) relies on aggressive weight-sharing to reduce its search cost. This leads to GPU-memory bottlenecks that hamper…”
    Conference Proceeding
  7.

    Invited: Algorithms and Architectures for Accelerating Long Read Sequence Analysis by Gamaarachchi, Hasindu, Liyanage, Kisaru, Parameswaran, Sri

    Published: IEEE 09.07.2023
    “…Genome sequencing is continuing to revolutionize the medical, forensics, agricultural, and biosecurity fields. The enormous amounts of data from modern…”
    Conference Proceeding
  8.

    GARL: Genetic Algorithm-Augmented Reinforcement Learning to Detect Violations in Marker-Based Autonomous Landing Systems by Liang, Linfeng, Deng, Yao, Morton, Kye, Kallinen, Valtteri, James, Alice, Seth, Avishkar, Kuantama, Endrowednes, Mukhopadhyay, Subhas, Han, Richard, Zheng, Xi

    ISSN: 1558-1225
    Published: IEEE 26.04.2025
    “…Automated Uncrewed Aerial Vehicle (UAV) landing is crucial for autonomous UAV services such as monitoring, surveying, and package delivery. It involves…”
    Conference Proceeding
  9.

    ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs by Gu, Junyu, Li, Shunde, Cao, Rongqiang, Wang, Jue, Wang, Zijian, Liang, Zhiqiang, Liu, Fang, Li, Shigang, Zhou, Chunbao, Wang, Yangang, Chi, Xuebin

    Published: IEEE 22.06.2025
    “…Full-batch Graph Neural Network (GNN) training is indispensable for interdisciplinary applications. Although full-batch training has advantages in convergence…”
    Conference Proceeding
  10.

    Optimal Memory Allocation and Scheduling for DMA Data Transfers under the LET Paradigm by Pazzaglia, Paolo, Casini, Daniel, Biondi, Alessandro, Natale, Marco Di

    Published: IEEE 05.12.2021
    “…The Logical Execution Time (LET) paradigm is increasingly used to achieve predictable communications in modern multicore automotive applications. Direct Memory…”
    Conference Proceeding
  11.

    SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction by Gui, Chuangyi, Liao, Xiaofei, Zheng, Long, Yao, Pengcheng, Wang, Qinggang, Jin, Hai

    Published: IEEE 01.09.2021
    “…Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem…”
    Conference Proceeding
  12.

    A Universal Method for Task Allocation on FP-FPS Multiprocessor Systems with Spin Locks by Zhao, Shuai, Chen, Nan, Fang, Yinjie, Li, Zhao, Chang, Wanli

    Published: IEEE 09.07.2023
    “…Many complex real-time systems, such as increasingly automated vehicles and 5G wireless base stations, contain a large amount of shared resources that must be…”
    Conference Proceeding
  13.

    Parallelizing Maximal Clique Enumeration on GPUs by Almasri, Mohammad, Chang, Yen-Hsiang, Hajj, Izzat El, Nagi, Rakesh, Xiong, Jinjun, Hwu, Wen-mei

    Published: IEEE 21.10.2023
    “…We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm. Prior works…”
    Conference Proceeding
  14.

    InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-Aware Inner Product Processing by Baek, Daehyeon, Hwang, Soojin, Heo, Taekyung, Kim, Daehoon, Huh, Jaehyuk

    Published: IEEE 01.09.2021
    “…Sparse matrix multiplication is one of the key computational kernels in large-scale data analytics. However, a naive implementation suffers from the overheads…”
    Conference Proceeding
  15.

    ACGraph: Accelerating Streaming Graph Processing via Dependence Hierarchy by Jiang, Zihan, Mao, Fubing, Guo, Yapu, Liu, Xu, Liu, Haikun, Liao, Xiaofei, Jin, Hai, Zhang, Wei

    Published: IEEE 09.07.2023
    “…Streaming graph processing needs to timely evaluate continuous queries. Prior systems suffer from massive redundant computations due to the irregular order of…”
    Conference Proceeding
  16.

    Mixed-Precision Quantization for Deep Vision Models with Integer Quadratic Programming by Deng, Zihao, Sharify, Sayeh, Wang, Xin, Orshansky, Michael

    Published: IEEE 22.06.2025
    “…Quantization is a widely used technique to compress neural networks. Assigning uniform bit-widths across all layers can result in significant accuracy…”
    Conference Proceeding
  17.

    pSyncPIM: Partially Synchronous Execution of Sparse Matrix Operations for All-Bank PIM Architectures by Baek, Daehyeon, Hwang, Soojin, Huh, Jaehyuk

    Published: IEEE 29.06.2024
    “…Recent commercial incarnations of processing-in-memory (PIM) maintain the standard DRAM interface and employ the all-bank mode execution to maximize bank-level…”
    Conference Proceeding
  18.

    BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory by Hakert, Christian, Khan, Asif Ali, Chen, Kuan-Hsun, Hameed, Fazal, Castrillon, Jeronimo, Chen, Jian-Jia

    Published: IEEE 05.12.2021
    “…Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge). In…”
    Conference Proceeding
  19.

    Seer: Predictive Runtime Kernel Selection for Irregular Problems by Swann, Ryan, Osama, Muhammad, Sangaiah, Karthik, Mahmud, Jalal

    ISSN: 2643-2838
    Published: IEEE 02.03.2024
    “…Modern GPUs are designed for regular problems and suffer from load imbalance when processing irregular data. Prior to our work, a domain expert selects the…”
    Conference Proceeding
  20.

    Formulating Data-arrival Synchronizers in Integer Linear Programming for CGRA Mapping by Guo, Yijiang, Wang, Jiarui, Zhang, Jiaxi, Luo, Guojie

    Published: IEEE 05.12.2021
    “…Coarse-grained reconfigurable architecture (CGRA) is a promising programmable device with high performance and power efficiency. The CGRA compilation problem…”
    Conference Proceeding