SQLoop: High Performance Iterative Processing in Data Management

Bibliographic Details
Published in:	2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS), pp. 1039-1051
Main Authors:	Floratos, Sofoklis; Zhang, Yanfeng; Yuan, Yuan; Lee, Rubao; Zhang, Xiaodong
Format:	Conference Proceeding
Language:	English
Published:	IEEE, 1 July 2018
Subjects:	Acceleration; Asynchronous Computation; Engines; Iterative; Parallel Query Execution; Query processing; SQL; Structured Query Language; Task analysis
ISSN:	2575-8411
Online Access:	Full text

Description
Summary:	A growing number of iterative and recursive query tasks, such as graph-structured data analytics, are processed in data management systems and demand fast response times. However, existing CTE-based recursive SQL and its implementation handle this intensive query processing poorly, with two major drawbacks. First, the iteration execution model is based on implicit set-oriented terminating conditions that cannot express aggregation-based tasks such as PageRank. Second, the synchronous execution model cannot perform asynchronous computation to further accelerate parallel execution. To address these two issues, we have designed and implemented SQLoop, a framework that extends the semantics of the current SQL standard to accommodate iterative SQL queries. SQLoop interfaces between users and different database engines with two powerful components. First, it provides a uniform SQL expression for users to access any database engine, so that they need not write database-dependent SQL or move datasets out of a target engine to process them at their own sites. Second, SQLoop automatically parallelizes iterative queries that contain certain aggregate functions, in both synchronous and asynchronous ways. More specifically, SQLoop can take advantage of intermediate results generated between iterations and prioritize the execution of partitions that accelerate query processing. We have tested and evaluated SQLoop using three popular database engines with real-world datasets and queries, and have shown its effectiveness and high performance.
DOI:	10.1109/ICDCS.2018.00104
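
Illustration:	The first drawback named in the summary can be made concrete with a small sketch. It is not SQLoop's syntax or implementation, which this record does not show. It assumes Python's standard `sqlite3` module and a hypothetical schema (`nodes`, `edges`, `rank`, `next_rank`), and all function names are invented for illustration. A recursive CTE stops implicitly once no new rows appear, which is enough for reachability. PageRank instead needs a per-iteration aggregate and an explicit numeric stopping test, so here the loop is driven from outside SQL.

```python
import sqlite3


def build_demo_graph():
    """Create a tiny in-memory graph (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE edges (src INTEGER, dst INTEGER)")
    conn.executemany("INSERT INTO nodes VALUES (?)", [(1,), (2,), (3,)])
    conn.executemany("INSERT INTO edges VALUES (?, ?)",
                     [(1, 2), (1, 3), (2, 3), (3, 1)])
    return conn


def reachable_with_cte(conn, start):
    """Reachability via a recursive CTE: iteration ends implicitly
    when the recursive step yields no rows that are not already in the set."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(node) AS (
            SELECT ?
            UNION
            SELECT e.dst FROM edges e JOIN reach r ON e.src = r.node
        )
        SELECT node FROM reach ORDER BY node
        """,
        (start,),
    ).fetchall()
    return [row[0] for row in rows]


def pagerank_loop(conn, damping=0.85, tol=1e-6, max_iters=100):
    """PageRank with an explicit, aggregate-based stopping condition.
    Assumes every node has at least one outgoing edge (no dangling nodes)."""
    n = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
    conn.execute("CREATE TABLE rank (id INTEGER PRIMARY KEY, score REAL)")
    conn.execute("CREATE TABLE next_rank (id INTEGER PRIMARY KEY, score REAL)")
    conn.execute("INSERT INTO rank SELECT id, 1.0 / ? FROM nodes", (n,))
    iterations = 0
    for iterations in range(1, max_iters + 1):
        conn.execute("DELETE FROM next_rank")
        # One synchronous iteration: sum the contributions of in-neighbours.
        conn.execute(
            """
            INSERT INTO next_rank
            SELECT n.id,
                   (1.0 - ?) / ? + ? * COALESCE(SUM(r.score / d.outdeg), 0.0)
            FROM nodes n
            LEFT JOIN edges e ON e.dst = n.id
            LEFT JOIN rank r ON r.id = e.src
            LEFT JOIN (SELECT src, COUNT(*) AS outdeg
                       FROM edges GROUP BY src) d ON d.src = e.src
            GROUP BY n.id
            """,
            (damping, n, damping),
        )
        # Explicit terminating condition: total change across all scores.
        delta = conn.execute(
            """
            SELECT SUM(ABS(nr.score - r.score))
            FROM next_rank nr JOIN rank r ON r.id = nr.id
            """
        ).fetchone()[0]
        conn.execute("DELETE FROM rank")
        conn.execute("INSERT INTO rank SELECT id, score FROM next_rank")
        if delta < tol:
            break
    scores = dict(conn.execute("SELECT id, score FROM rank").fetchall())
    return scores, iterations


conn = build_demo_graph()
reachable = reachable_with_cte(conn, 1)
ranks, iterations = pagerank_loop(conn)
print(reachable, ranks, iterations)
```

Each pass of the loop waits for the whole `next_rank` table before the next pass starts, so the sketch is synchronous. The asynchronous, prioritized execution described in the summary is not modeled here.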