Topology-aware GPU scheduling for learning workloads in cloud environments
| Published in: | International Conference for High Performance Computing, Networking, Storage and Analysis (Online), pp. 1-12 |
|---|---|
| Main authors: | , , , , |
| Format: | Conference paper |
| Language: | English |
| Published: | New York, NY, USA: Association for Computing Machinery (ACM), 12 Nov. 2017 |
| Series: | ACM Conferences |
| Subjects: | |
| ISBN: | 9781450351140, 145035114X |
| ISSN: | 2167-4337 |
| Online access: | Full text |
| Abstract: | Recent advances in hardware, such as systems with multiple GPUs and their availability in the cloud, are enabling deep learning in various domains including health care, autonomous vehicles, and Internet of Things. Multi-GPU systems exhibit complex connectivity among GPUs and between GPUs and CPUs. Workload schedulers must consider hardware topology and workload communication requirements in order to allocate CPU and GPU resources for optimal execution time and improved utilization in shared cloud environments. This paper presents a new topology-aware workload placement strategy to schedule deep learning jobs on multi-GPU systems. The placement strategy is evaluated with a prototype on a Power8 machine with Tesla P100 cards, showing speedups of up to ≈1.30x compared to state-of-the-art strategies; the proposed algorithm achieves this result by allocating GPUs that satisfy workload requirements while preventing interference. Additionally, a large-scale simulation shows that the proposed strategy provides higher resource utilization and performance in cloud systems. |
|---|---|
| DOI: | 10.1145/3126908.3126933 |
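
The abstract describes the approach only at a high level. As a purely illustrative aid, and not the algorithm from the paper, the following minimal Python sketch shows the general idea of topology-aware placement: given a GPU-to-GPU link-bandwidth matrix, a scheduler prefers to co-allocate GPUs connected by the fastest links, so a job's communication stays on high-bandwidth paths and competes less with co-located jobs. The bandwidth values and the `score`/`place_job` helpers are assumptions made for this sketch.

```python
from itertools import combinations

# Hypothetical link-bandwidth matrix (GB/s) for a 4-GPU node: GPUs 0-1 and
# 2-3 are NVLink pairs, while the remaining pairs use slower links.
# These numbers are illustrative assumptions, not measurements from the paper.
BANDWIDTH = [
    [0, 80, 16, 16],
    [80, 0, 16, 16],
    [16, 16, 0, 80],
    [16, 16, 80, 0],
]

def score(gpu_set):
    """Aggregate pairwise bandwidth inside a candidate set (higher = better connected)."""
    return sum(BANDWIDTH[a][b] for a, b in combinations(gpu_set, 2))

def place_job(num_gpus, free_gpus):
    """Return the free GPU subset of the requested size with the best connectivity."""
    candidates = combinations(sorted(free_gpus), num_gpus)
    return max(candidates, key=score, default=None)

if __name__ == "__main__":
    free = {0, 1, 2, 3}
    first = place_job(2, free)     # picks the NVLink pair (0, 1)
    print("first job ->", first, "score", score(first))
    free -= set(first)
    second = place_job(2, free)    # the remaining pair (2, 3) stays intact
    print("second job ->", second, "score", score(second))
```

In this toy model, packing a job onto the best-connected GPUs also leaves the other NVLink pair untouched for the next job, which loosely mirrors the interference-avoidance goal mentioned in the abstract.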

