Mapping parallel programs to heterogeneous multi-core systems

Bibliographic Details
Title: Mapping parallel programs to heterogeneous multi-core systems
Authors: Grewe, Dominik
Contributors: O'Boyle, Michael; Franke, Bjoern
Publisher Information: University of Edinburgh, 2014.
Publication Year: 2014
Collection: University of Edinburgh
Subject Terms: 004, heterogeneous computing, parallel computing, GPU, OpenCL, predictive modeling
Description: Heterogeneous computer systems are ubiquitous in all areas of computing, from mobile to high-performance computing. They promise to deliver increased performance at lower energy cost than purely homogeneous, CPU-based systems. In recent years GPU-based heterogeneous systems have become increasingly popular. They combine a programmable GPU with a multi-core CPU. GPUs have become flexible enough to handle not only graphics workloads but also various kinds of general-purpose algorithms. They are thus used as a coprocessor or accelerator alongside the CPU. Developing applications for GPU-based heterogeneous systems involves several challenges. Firstly, not all algorithms are equally suited to GPU computing. It is thus important to carefully map the tasks of an application to the most suitable processor in a system. Secondly, current frameworks for heterogeneous computing, such as OpenCL, are low-level, requiring a thorough understanding of the hardware by the programmer. This high barrier to entry could be lowered by automatically generating and tuning this code from a high-level and thus more user-friendly programming language. Both challenges are addressed in this thesis. For the task mapping problem, a machine learning-based approach is presented. It combines static features of the program code with runtime information on input sizes to predict the optimal mapping of OpenCL kernels. This approach is further extended to also take contention on the GPU into account. Both methods are able to outperform competing mapping approaches by a significant margin. Furthermore, this thesis develops a method for targeting GPU-based heterogeneous systems from OpenMP, a directive-based framework for parallel computing. OpenMP programs are translated to OpenCL and optimized for GPU performance. At runtime a predictive model decides whether to execute the original OpenMP code on the CPU or the generated OpenCL code on the GPU. This approach is shown to outperform both a competing approach and hand-tuned code.
Document Type: Electronic Thesis or Dissertation
Language: English
Access URL: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.615438
Accession Number: edsble.615438
Database: British Library EThOS