Search results - Theory of computation - Design and analysis of algorithms

1

Late Breaking Results: An Efficient and Scalable Track Assignment with GPU Parallelism
Authors: Liu, Genggeng; Huang, Pengcheng; Li, Zepeng; Liu, Wen-Hao; Huang, Xing; Guo, Wenzhong

Published: IEEE 22.06.2025

Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“… Based on the independence and divisibility of track assignment, we propose a GPU-accelerated parallel track assignment algorithm…”

Conference paper

2

DS-GL: Advancing Graph Learning via Harnessing Nature's Power within Scalable Dynamical Systems
Authors: Song, Ruibing; Wu, Chunshu; Liu, Chuan; Li, Ang; Huang, Michael; Geng, Tony Tong

Published: IEEE 29.06.2024

Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“… problems and have been adopted for traditional graph computation, such as max-cut. However, when performing complex Graph Learning (GL…”

Conference paper

3

Invited: Algorithms and Architectures for Accelerating Long Read Sequence Analysis
Authors: Gamaarachchi, Hasindu; Liyanage, Kisaru; Parameswaran, Sri

Published: IEEE 09.07.2023

Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…; and three, novel algorithms and domain-specific architectures for rapid in situ analysis of third-generation sequencing data…”

Conference paper

4

Optimal Memory Allocation and Scheduling for DMA Data Transfers under the LET Paradigm
Authors: Pazzaglia, Paolo; Casini, Daniel; Biondi, Alessandro; Natale, Marco Di

Published: IEEE 05.12.2021

Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…The Logical Execution Time (LET) paradigm is increasingly used to achieve predictable communications in modern multicore automotive applications. Direct Memory…”

Conference paper

5

Skywalker: Efficient Alias-Method-Based Graph Sampling and Random Walk on GPUs
Authors: Wang, Pengyu; Li, Chao; Wang, Jing; Wang, Taolei; Zhang, Lu; Leng, Jingwen; Chen, Quan; Guo, Minyi

Published: IEEE 01.09.2021

Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“…Graph sampling and random walk operations, capturing the structural properties of graphs, are playing an important role today as we cannot directly adopt computing-intensive algorithms on large-scale graphs…”

Conference paper

6

BlasPart: A Deterministic Parallel Partitioner for Balanced Large-Scale Hypergraph Partitioning
Authors: Tong, Shengbo; Pei, Chunyan; Yu, Wenjian

Published: IEEE 22.06.2025

Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“…Balanced hypergraph partitioning is a fundamental problem in applications like VLSI design, high-performance computing, etc…”

Conference paper

7

ParGNN: A Scalable Graph Neural Network Training Framework on multi-GPUs
Authors: Gu, Junyu; Li, Shunde; Cao, Rongqiang; Wang, Jue; Wang, Zijian; Liang, Zhiqiang; Liu, Fang; Li, Shigang; Zhou, Chunbao; Wang, Yangang; Chi, Xuebin

Published: IEEE 22.06.2025

Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC) (22.06.2025)
“… over-partition to alleviate load imbalance. Based on the over-partition results, we present a subgraph pipeline algorithm to overlap communication and computation while maintaining the accuracy of GNN training…”

Conference paper

8

PertNAS: Architectural Perturbations for Memory-Efficient Neural Architecture Search
Authors: Ahmad, Afzal; Xie, Zhiyao; Zhang, Wei

Published: IEEE 09.07.2023

Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“… This leads to GPU-memory bottlenecks that hamper the algorithm's scalability. To resolve these bottlenecks, we propose a perturbations-based evolutionary approach…”

Conference paper

9

BLESS: Bandwidth and Locality Enhanced SMEM Seeding Acceleration for DNA Sequencing
Authors: Han, Seunghee; Moon, Seungjae; Suh, Teokkyu; Heo, JaeHoon; Kim, Joo-Young

Published: IEEE 29.06.2024

Published in: 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA) (29.06.2024)
“… The seeding process, which aims to find locations of super-maximal exact matches (SMEM) between the DNA samples and reference genome for comparative analysis, has emerged as a major bottleneck due to its memory-intensive characteristics…”

Conference paper

10

A Universal Method for Task Allocation on FP-FPS Multiprocessor Systems with Spin Locks
Authors: Zhao, Shuai; Chen, Nan; Fang, Yinjie; Li, Zhao; Chang, Wanli

Published: IEEE 09.07.2023

Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“… Unfortunately, these existing methods either are tailored for specific scheduling and analysis approaches, or introduce runtime overhead that undermines their applicability…”

Conference paper

11

ACGraph: Accelerating Streaming Graph Processing via Dependence Hierarchy
Authors: Jiang, Zihan; Mao, Fubing; Guo, Yapu; Liu, Xu; Liu, Haikun; Liao, Xiaofei; Jin, Hai; Zhang, Wei

Published: IEEE 09.07.2023

Published in: 2023 60th ACM/IEEE Design Automation Conference (DAC) (09.07.2023)
“…Streaming graph processing needs to timely evaluate continuous queries. Prior systems suffer from massive redundant computations due to the irregular order of processing vertices influenced by updates…”

Conference paper

12

GARL: Genetic Algorithm-Augmented Reinforcement Learning to Detect Violations in Marker-Based Autonomous Landing Systems
Authors: Liang, Linfeng; Deng, Yao; Morton, Kye; Kallinen, Valtteri; James, Alice; Seth, Avishkar; Kuantama, Endrowednes; Mukhopadhyay, Subhas; Han, Richard; Zheng, Xi

ISSN: 1558-1225

Published: IEEE 26.04.2025

Published in: Proceedings / International Conference on Software Engineering (26.04.2025)
“… To address these issues, we introduce GARL, a framework combining a genetic algorithm (GA) and reinforcement learning (RL…”

Conference paper

13

Parallelizing Maximal Clique Enumeration on GPUs
Authors: Almasri, Mohammad; Chang, Yen-Hsiang; Hajj, Izzat El; Nagi, Rakesh; Xiong, Jinjun; Hwu, Wen-mei

Published: IEEE 21.10.2023

Published in: 2023 32nd International Conference on Parallel Architectures and Compilation Techniques (PACT) (21.10.2023)
“…We present a GPU solution for exact maximal clique enumeration (MCE) that performs a search tree traversal following the Bron-Kerbosch algorithm…”

Conference paper

14

SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction
Authors: Gui, Chuangyi; Liao, Xiaofei; Zheng, Long; Yao, Pengcheng; Wang, Qinggang; Jin, Hai

Published: IEEE 01.09.2021

Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“…Graph mining aims to explore interesting structural information of a graph. Pattern-centric systems typically transform a generic-purpose graph mining problem…”

Conference paper

15

Placement Tomography-Based Routing Blockage Generation for DRV Hotspot Mitigation
Authors: Kahng, Andrew B.; Kundu, Sayak; Yoon, Dooseok

ISSN: 1558-2434

Published: ACM 27.10.2024

Published in: Digest of technical papers - IEEE/ACM International Conference on Computer-Aided Design (27.10.2024)
“…A fundamental goal in modern physical design is for the post-route layout to have a fixable number of remaining design rule violations (DRVs…”

Conference paper

16

Asynchronous Distributed-Memory Parallel Algorithms for Influence Maximization
Authors: Singhal, Shubhendra Pal; Hati, Souvadra; Young, Jeffrey; Sarkar, Vivek; Hayashi, Akihiro; Vuduc, Richard

Published: IEEE 17.11.2024

Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024)
“… We propose distributed-memory parallel algorithms for the two main kernels of a state-of-the-art implementation of one IM algorithm, influence maximization via martingales (IMM…”

Conference paper

17

InnerSP: A Memory Efficient Sparse Matrix Multiplication Accelerator with Locality-Aware Inner Product Processing
Authors: Baek, Daehyeon; Hwang, Soojin; Heo, Taekyung; Kim, Daehoon; Huh, Jaehyuk

Published: IEEE 01.09.2021

Published in: 2021 30th International Conference on Parallel Architectures and Compilation Techniques (PACT) (01.09.2021)
“… To mitigate the memory access overheads, recent accelerator designs advocated the outer product processing which minimizes input accesses but generates intermediate products to be merged to the final output matrix…”

Conference paper

18

NEO-DNND: Communication-Optimized Distributed Nearest Neighbor Graph Construction
Authors: Iwabuchi, Keita; Steil, Trevor; Priest, Benjamin W.; Pearce, Roger; Sanders, Geoffrey

Published: IEEE 17.11.2024

Published in: SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024)
“…Graph-based approximate nearest neighbor algorithms have shown high neighbor structure representation quality…”

Conference paper

19

BLOwing Trees to the Ground: Layout Optimization of Decision Trees on Racetrack Memory
Authors: Hakert, Christian; Khan, Asif Ali; Chen, Kuan-Hsun; Hameed, Fazal; Castrillon, Jeronimo; Chen, Jian-Jia

Published: IEEE 05.12.2021

Published in: 2021 58th ACM/IEEE Design Automation Conference (DAC) (05.12.2021)
“…Modern distributed low power systems tend to integrate machine learning algorithms, which are directly executed on the distributed devices (on the edge…”

Conference paper

20

Network-Offloaded Bandwidth-Optimal Broadcast and Allgather for Distributed AI
Authors: Khalilov, Mikhail; Girolamo, Salvatore Di; Chrapek, Marcin; Nudelman, Rami; Bloch, Gil; Hoefler, Torsten

Published: IEEE 17.11.2024

Published in: SC24: International Conference for High Performance Computing, Networking, Storage and Analysis (17.11.2024)
“…In the Fully Sharded Data Parallel (FSDP) training pipeline, collective operations can be interleaved to maximize the communication/computation overlap…”

Conference paper

