Latency‐aware adaptive micro‐batching techniques for streamed data compression on graphics processing units

Bibliographic Details
Published in:	Concurrency and computation, Vol. 33, No. 11
Main Authors:	Stein, Charles M.; Rockenbach, Dinei A.; Griebler, Dalvan; Torquati, Massimo; Mencagli, Gabriele; Danelutto, Marco; Fernandes, Luiz G.
Format:	Journal Article
Language:	English
Published:	Hoboken: Wiley Subscription Services, Inc, 10 June 2021
Subjects:	Accelerators; Adaptive algorithms; Algorithms; Compression tests; Data compression; data compression algorithms; dynamic reconfiguration; Graphics processing units; parallel programming; service level objective; stream parallelism; stream processing; Workload; Workloads
ISSN:	1532-0626, 1532-0634
Online Access:	Full text

Description
Abstract:	Stream processing is a parallel paradigm used in many application domains. As graphics processing units (GPUs) have advanced, their use in stream processing applications has increased as well. Using GPU accelerators efficiently in streaming scenarios requires batching input elements into microbatches, whose computation is offloaded to the GPU to exploit data parallelism within each batch. Because data elements arrive continuously at the input rate, the larger the microbatch, the longer it takes to buffer it completely and to start processing on the device. However, stream processing applications often have strict latency requirements, so the microbatch size must be chosen carefully and adapted dynamically to the workload conditions and to the characteristics of the underlying device and network. In this work, we aim to implement latency‐aware adaptive microbatching techniques and algorithms for streaming compression applications targeting GPUs. The evaluation uses the Lempel‐Ziv‐Storer‐Szymanski compression application with different input workloads. As a general result, we observed that algorithms with elastic adaptation factors respond better to stable workloads, while algorithms with narrower targets respond better to highly unbalanced workloads.
Bibliography:	Funding information: Conselho Nacional de Desenvolvimento Científico e Tecnológico, 437693/2018‐0; Coordenação de Aperfeiçoamento de Pessoal de Nível Superior, 001; Fundação de Amparo à Pesquisa do Estado do Rio Grande do Sul, 17/2551‐0000871‐5
DOI:	10.1002/cpe.5786
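
The trade-off described in the abstract (larger microbatches take longer to buffer, so batch size has to follow a latency objective) can be illustrated with a minimal sketch. This is not one of the algorithms evaluated in the article: the class `BatchSizeController`, its parameters `target_latency`, `tolerance`, and `factor`, and the multiplicative update rule are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


def buffering_latency(batch_size: int, arrival_rate: float) -> float:
    """Seconds needed to fill one microbatch at arrival_rate items per second."""
    return batch_size / arrival_rate


@dataclass
class BatchSizeController:
    """Hypothetical controller (an assumption, not the article's method)."""

    target_latency: float    # latency objective in seconds (assumed SLO)
    tolerance: float = 0.1   # accepted relative deviation around the target
    factor: float = 0.25     # relative step applied on each adjustment
    min_size: int = 1
    max_size: int = 1 << 20

    def next_size(self, current_size: int, measured_latency: float) -> int:
        upper = self.target_latency * (1 + self.tolerance)
        lower = self.target_latency * (1 - self.tolerance)
        if measured_latency > upper:
            # Too slow: shrink the batch by at least one element.
            proposed = min(current_size - 1,
                           int(current_size * (1 - self.factor)))
        elif measured_latency < lower:
            # Latency headroom: grow the batch by at least one element.
            proposed = max(current_size + 1,
                           int(current_size * (1 + self.factor)))
        else:
            # Within the accepted band around the target: keep the size.
            proposed = current_size
        # Clamp to the allowed range.
        return max(self.min_size, min(self.max_size, proposed))


if __name__ == "__main__":
    ctrl = BatchSizeController(target_latency=0.05)
    size = 1000
    # Simulated latency measurements in seconds (made-up values).
    for latency in (0.08, 0.07, 0.05, 0.02):
        size = ctrl.next_size(size, latency)
        print(f"measured {latency:.2f}s -> next batch size {size}")
```

In this sketch, `factor` controls how aggressively the size changes and `tolerance` controls how narrow the accepted band around the target is. These names only loosely echo the abstract's "adaptation factors" and "targets"; their exact definitions here are assumptions.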