Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems

Bibliographic details
Published in: 2011 IEEE International Conference on Cluster Computing, pp. 386-394
Main authors: Long Chen, Villa, O., Gao, G. R.
Format: Conference paper
Language: English
Published: IEEE, 1 September 2011
ISBN:9781457713552, 1457713551
ISSN:1552-5244
Description
Summary: Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when multiple GPUs are used concurrently, conventional data-parallel GPU programming paradigms, e.g., CUDA, cannot satisfactorily address certain issues, such as load balancing, GPU resource utilization, and overlapping fine-grained computation with communication. In this paper, we present a fine-grained task-based execution framework for multi-GPU systems. By scheduling tasks among multiple GPUs that are finer-grained than those supported by the conventional CUDA programming method, and by allowing concurrent task execution on a single GPU, our framework provides a means of solving the above issues and efficiently utilizing multi-GPU systems. Experiments with a molecular dynamics application show that, for nonuniformly distributed workloads, solutions based on our framework achieve good load balance and considerable performance improvement over other solutions based on standard CUDA programming methodologies.
DOI:10.1109/CLUSTER.2011.50