Towards enabling I/O awareness in task-based programming models

Detailed description

Bibliographic details
Published in:	Future Generation Computer Systems, vol. 121, pp. 74-89
Main authors:	Elshazly, Hatem; Ejarque, Jorge; Lordan, Francesc; Badia, Rosa M.
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 1 August 2021
Keywords:	Auto-tunable constraints; I/O awareness; I/O congestion; I/O intensive applications; I/O scheduling; I/O-compute overlap; Task-based programming models
ISSN:	0167-739X, 1872-7115
Online access:	Full text

Description
Abstract:	Storage systems have not kept pace with the rate of technological improvement in computing systems. As applications produce more and more data, I/O becomes the limiting factor for increasing application performance. I/O congestion caused by concurrent access to storage devices is one of the main obstacles that cause I/O performance degradation and, consequently, total performance degradation. Although task-based programming models have made it possible to achieve higher levels of parallelism by enabling the execution of tasks on large-scale distributed platforms, this parallelism has only benefited the compute workload of the application. Previous efforts addressing I/O performance bottlenecks focused either on optimizing fine-grained I/O access patterns using I/O libraries or on avoiding system-wide I/O congestion by minimizing interference between multiple applications. In this paper, we propose enabling I/O awareness in task-based programming models to improve the total performance of applications. An I/O aware programming model is able to create more parallelism and mitigate the causes of I/O performance degradation. On the one hand, more parallelism can be created by supporting special tasks for executing I/O workloads, called I/O tasks, that can overlap with the execution of compute tasks. On the other hand, I/O congestion can be mitigated by constraining the scheduling of I/O tasks. We propose two approaches for specifying such constraints: they are either set explicitly by the user or automatically inferred and tuned during the application's execution to optimize the execution of variable I/O workloads on a given storage infrastructure. We implement our proposal using PyCOMPSs, a task-based programming model for parallelizing Python applications. Our experiments on the MareNostrum 4 supercomputer demonstrate that I/O aware PyCOMPSs can achieve a significant improvement in the total execution time of applications with different I/O workloads. This improvement reaches up to 43% of total application performance compared with the I/O non-aware version of PyCOMPSs.
• Programming model and runtime support to overlap the execution of I/O and compute workloads.
• Programming model support and scheduling techniques to mitigate I/O congestion.
• Mechanisms to automatically infer and tune I/O task bandwidth constraints at runtime.
• Prototype implementation of I/O awareness on an actual task-based programming model.
• Evaluation of the performance benefits of I/O awareness on the MareNostrum 4 supercomputer.
DOI:	10.1016/j.future.2021.03.009
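
The two ideas named in the abstract, overlapping I/O tasks with compute tasks and constraining how many I/O tasks run at once, can be illustrated with a minimal sketch in plain Python. It does not use the PyCOMPSs API and is not the authors' implementation: the thread pools, the semaphore, and every name in it (`MAX_CONCURRENT_IO`, `compute_task`, `io_task`, `run`) are assumptions made for illustration, and the fixed limit stands in for a user-set constraint only. The auto-tuning described in the abstract is not shown.

```python
import os
import tempfile
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical user-set constraint: at most this many I/O tasks run at once.
MAX_CONCURRENT_IO = 2
io_slots = threading.BoundedSemaphore(MAX_CONCURRENT_IO)

# Bookkeeping used only to observe the peak number of simultaneous I/O tasks.
stats = {"active": 0, "peak": 0}
stats_lock = threading.Lock()


def compute_task(n):
    """Stand-in compute workload: sum of squares below n."""
    return sum(i * i for i in range(n))


def io_task(path, value):
    """Stand-in I/O workload: write one result, gated by the I/O constraint."""
    with io_slots:  # blocks while MAX_CONCURRENT_IO writers are already active
        with stats_lock:
            stats["active"] += 1
            stats["peak"] = max(stats["peak"], stats["active"])
        try:
            with open(path, "w") as fh:
                fh.write(str(value))
        finally:
            with stats_lock:
                stats["active"] -= 1
    return path


def run(workdir, sizes):
    """Run compute tasks and pass each result to an I/O task once it is ready.

    Writes for early results proceed in the I/O pool while later compute
    tasks are still running, which is the overlap being illustrated.
    """
    with ThreadPoolExecutor(max_workers=4) as compute_pool:
        with ThreadPoolExecutor(max_workers=4) as io_pool:
            computed = [compute_pool.submit(compute_task, n) for n in sizes]
            written = [
                io_pool.submit(
                    io_task, os.path.join(workdir, f"out_{i}.txt"), fut.result()
                )
                for i, fut in enumerate(computed)
            ]
            return [fut.result() for fut in written]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        paths = run(workdir, [10, 100, 1000, 10000])
        print(len(paths), "files written; peak concurrent I/O tasks:", stats["peak"])
```

The I/O pool has more workers than the semaphore admits, so the limit on simultaneous writes comes from the constraint and not from the pool size. A runtime that tuned the constraint automatically would adjust that limit during execution instead of fixing it up front.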