LRP: learned robust data partitioning for efficient processing of large dynamic queries
The interconnection between query processing and data partitioning is pivotal for the acceleration of massive data processing during query execution, primarily by minimizing the number of scanned block files. Existing partitioning techniques predominantly focus on query accesses on numeric columns f...
Uloženo v:
| Vydáno v: | Frontiers of Computer Science Ročník 19; číslo 9; s. 199607 |
|---|---|
| Hlavní autoři: | , , , , |
| Médium: | Journal Article |
| Jazyk: | angličtina |
| Vydáno: |
Beijing
Higher Education Press
01.09.2025
Springer Nature B.V |
| Témata: | |
| ISSN: | 2095-2228, 2095-2236 |
| On-line přístup: | Získat plný text |
| Tagy: |
Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
|
| Shrnutí: | The interconnection between query processing and data partitioning is pivotal for the acceleration of massive data processing during query execution, primarily by minimizing the number of scanned block files. Existing partitioning techniques predominantly focus on query accesses on numeric columns for constructing partitions, often overlooking non-numeric columns and thus limiting optimization potential. Additionally, these techniques, despite creating fine-grained partitions from representative queries to enhance system performance, experience from notable performance declines due to unpredictable fluctuations in future queries. To tackle these issues, we introduce LRP, a learned robust partitioning system for dynamic query processing. LRP first proposes a method for data and query encoding that captures comprehensive column access patterns from historical queries. It then employs Multi-Layer Perceptron and Long Short-Term Memory networks to predict shifts in the distribution of historical queries. To create high-quality, robust partitions based on these predictions, LRP adopts a greedy beam search algorithm for optimal partition division and implements a data redundancy mechanism to share frequently accessed data across partitions. Experimental evaluations reveal that LRP yields partitions with more stable performance under incoming queries and significantly surpasses state-of-the-art partitioning methods. |
|---|---|
| Bibliografie: | data partitioning data encoding data redundancy Document received on :2024-05-21 Document accepted on :2024-09-08 query prediction beam search ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 2095-2228 2095-2236 |
| DOI: | 10.1007/s11704-024-40509-4 |