Accelerating aggregation using intra-cycle parallelism

Bibliographic details
Published in: 2015 IEEE 31st International Conference on Data Engineering, pp. 291-302
Authors: Ziqiang Feng; Eric Lo
Format: Conference paper
Language: English
Published: IEEE, 01.04.2015
ISSN: 1063-6382
Online access: Full text
Description
Abstract: Modern CPUs have a word width of 64 bits, but real data values are usually represented using fewer bits than a CPU word. This underutilization of the CPU at the register level has motivated the recent development of bit-parallel algorithms that carry out data processing operations (e.g., filter scans) on CPU words packed with data values (e.g., eight data values packed into one 64-bit word). Bit-parallel algorithms fully unleash the intra-cycle parallelism of modern CPUs, and they are especially attractive to main-memory column stores whose goal is to process data at the speed of the "bare metal". Main-memory column stores generally focus on analytical queries, where aggregation is a common operation. Current bit-parallel algorithms, however, have not yet covered aggregation. In this paper, we present a suite of bit-parallel algorithms to accelerate all standard aggregation operations: SUM, MIN, MAX, AVG, MEDIAN, and COUNT. The algorithms are designed to fully leverage the intra-cycle parallelism in CPU cores when aggregating words of packed values. Experimental evaluation shows that our bit-parallel aggregation algorithms exhibit significant performance benefits compared with non-bit-parallel methods.
DOI: 10.1109/ICDE.2015.7113292
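
To make the packed-word idea from the abstract concrete, the following is a minimal SWAR-style sketch of a bit-parallel SUM over 64-bit words, each packed with eight 8-bit values. The lane layout, the even/odd two-accumulator scheme for avoiding cross-lane carries, and the helper names (packed_sum, hsum16) are illustrative assumptions for this sketch, not the algorithms described in the paper.

#include <stdint.h>
#include <stdio.h>

/* Hypothetical layout: eight 8-bit values packed into one 64-bit word.
 * Adding two packed words directly would let carries bleed across lane
 * boundaries, so each word is split into its even and odd byte lanes,
 * which are accumulated separately in words of 16-bit lanes. */

#define EVEN_BYTES 0x00FF00FF00FF00FFULL

/* Horizontal sum of the four 16-bit lanes of a 64-bit word. */
static uint64_t hsum16(uint64_t x) {
    x = (x & 0x0000FFFF0000FFFFULL) + ((x >> 16) & 0x0000FFFF0000FFFFULL);
    return (x & 0xFFFFFFFFULL) + (x >> 32);
}

/* SUM of all 8-bit values stored in n packed 64-bit words. */
uint64_t packed_sum(const uint64_t *words, size_t n) {
    uint64_t acc_even = 0, acc_odd = 0;           /* four 16-bit lanes each */
    for (size_t i = 0; i < n; i++) {
        acc_even += words[i] & EVEN_BYTES;        /* lanes 0, 2, 4, 6 */
        acc_odd  += (words[i] >> 8) & EVEN_BYTES; /* lanes 1, 3, 5, 7 */
        /* Each 16-bit lane has 8 spare bits, so up to 256 words can be
         * accumulated before a lane could overflow; longer inputs would
         * need a periodic flush (omitted here for brevity). */
    }
    return hsum16(acc_even) + hsum16(acc_odd);
}

int main(void) {
    /* Two packed words: values 1..8 and 10, 20, ..., 80 (lane 0 is the
     * least significant byte). Expected SUM = 36 + 360 = 396. */
    uint64_t words[2] = { 0x0807060504030201ULL, 0x50463C32281E140AULL };
    printf("SUM = %llu\n", (unsigned long long)packed_sum(words, 2));
    return 0;
}

The even/odd split leaves eight spare bits per 16-bit lane, so several hundred packed words can be summed before any lane can overflow; the paper's algorithms target general value widths and further aggregates (MIN, MAX, AVG, MEDIAN, COUNT), which this sketch does not attempt.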