Accelerating aggregation using intra-cycle parallelism

Bibliographic details
Published in: 2015 IEEE 31st International Conference on Data Engineering, pp. 291-302
Authors: Ziqiang Feng; Eric Lo
Format: Conference paper
Language: English
Published: IEEE, 01.04.2015
ISSN: 1063-6382
Online access: Full text
Description
Abstract: Modern CPUs have a word width of 64 bits, but real data values are usually represented using fewer bits than a CPU word. This underutilization of the CPU at the register level has motivated the recent development of bit-parallel algorithms that carry out data processing operations (e.g., filter scans) on CPU words packed with data values (e.g., eight data values packed into one 64-bit word). Bit-parallel algorithms fully unleash the intra-cycle parallelism of modern CPUs, and they are especially attractive to main-memory column stores whose goal is to process data at the speed of the "bare metal". Main-memory column stores generally focus on analytical queries, where aggregation is a common operation. Current bit-parallel algorithms, however, have not yet covered aggregation. In this paper, we present a suite of bit-parallel algorithms to accelerate all standard aggregation operations: SUM, MIN, MAX, AVG, MEDIAN, and COUNT. The algorithms are designed to fully leverage the intra-cycle parallelism in CPU cores when aggregating words of packed values. Experimental evaluation shows that our bit-parallel aggregation algorithms exhibit significant performance benefits compared with non-bit-parallel methods.
DOI: 10.1109/ICDE.2015.7113292
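
To make the packed-word idea from the abstract concrete, the following is a minimal SWAR-style sketch of a bit-parallel SUM over 64-bit words, each packed with eight 8-bit values. The lane layout, the even/odd two-accumulator scheme for avoiding cross-lane carries, and the helper names (packed_sum, hsum16) are illustrative assumptions for this sketch, not the algorithms described in the paper.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: eight 8-bit values packed into one 64-bit word.
 * Adding two packed words directly would let carries bleed across lane
 * boundaries, so each word is split into its even and odd byte lanes,
 * which are accumulated separately in words of 16-bit lanes. */

#define EVEN_BYTES 0x00FF00FF00FF00FFULL

/* Horizontal sum of the four 16-bit lanes of a 64-bit word. */
static uint64_t hsum16(uint64_t x) {
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x & 0xFFFFFFFFULL) + (x >> 32);
}

/* SUM of all 8-bit values stored in n packed 64-bit words. */
uint64_t packed_sum(const uint64_t *words, size_t n) {
    uint64_t acc_even = 0, acc_odd = 0;           /* four 16-bit lanes each */
    for (size_t i = 0; i < n; i++) {
        acc_even += words[i] & EVEN_BYTES;        /* lanes 0, 2, 4, 6 */
        acc_odd  += (words[i] >> 8) & EVEN_BYTES; /* lanes 1, 3, 5, 7 */
        /* Each 16-bit lane has 8 spare bits, so up to 256 words can be
         * accumulated before a lane could overflow; longer inputs would
         * need a periodic flush (omitted here for brevity). */
    }
    return hsum16(acc_even) + hsum16(acc_odd);
}

int main(void) {
    /* Two packed words: values 1..8 and 10, 20, ..., 80 (lane 0 is the
     * least significant byte). Expected SUM = 36 + 360 = 396. */
    uint64_t words[2] = { 0x0807060504030201ULL, 0x50463C32281E140AULL };
    printf("SUM = %llu\n", (unsigned long long)packed_sum(words, 2));
    return 0;
}

The even/odd split leaves eight spare bits per 16-bit lane, so several hundred packed words can be summed before any lane can overflow; the paper's algorithms target general value widths and further aggregates (MIN, MAX, AVG, MEDIAN, COUNT), which this sketch does not attempt.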