SQLoop: High Performance Iterative Processing in Data Management

Bibliographic Details
Published in:	2018 IEEE 38th International Conference on Distributed Computing Systems (ICDCS) pp. 1039 - 1051
Main Authors:	Floratos, Sofoklis; Zhang, Yanfeng; Yuan, Yuan; Lee, Rubao; Zhang, Xiaodong
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01.07.2018
Subjects:	Acceleration; Asynchronous Computation; Engines; Iterative; Parallel Query Execution; Query processing; SQL; Structured Query Language; Task analysis
ISSN:	2575-8411

Description
Summary:	A growing number of iterative and recursive query tasks, such as graph-structured data analytics, are processed in data management systems and demand fast response times. However, existing CTE-based recursive SQL and its implementations handle this intensive query processing poorly, with two major drawbacks. First, the iteration execution model is based on implicit set-oriented terminating conditions that cannot express aggregation-based tasks, such as PageRank. Second, the synchronous execution model cannot perform asynchronous computing to further accelerate execution in parallel. To address these two issues, we have designed and implemented SQLoop, a framework that extends the semantics of the current SQL standard to accommodate iterative SQL queries. SQLoop interfaces between users and different database engines with two powerful components. First, it provides a uniform SQL expression for users to access any database engine, so that they do not need to write database-dependent SQL or move datasets out of a target engine to process them at their own sites. Second, SQLoop automatically parallelizes iterative queries that contain certain aggregate functions in both synchronous and asynchronous ways. More specifically, SQLoop is able to take advantage of intermediate results generated between different iterations and to prioritize the execution of partitions that accelerate the query processing. We have tested and evaluated SQLoop using three popular database engines with real-world datasets and queries, and the results show its effectiveness and high performance.
DOI:	10.1109/ICDCS.2018.00104
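
The first drawback named in the summary can be illustrated with a small sketch. A recursive CTE stops when an iteration produces no new rows, whereas PageRank needs every iteration to recompute an aggregate and stop on a numeric convergence test. The Python code below is not SQLoop's syntax or implementation; it is a minimal, hedged illustration that drives the loop from outside the database using the standard sqlite3 module. The table names (edge, node, outdeg, pr, pr_next), the function name pagerank_sql, and the parameters damping, epsilon, and max_iters are all assumptions made for this example, and the graph is assumed to have no dangling nodes.

```python
import sqlite3


def pagerank_sql(edges, damping=0.85, epsilon=1e-6, max_iters=100):
    """Iterate a SQL aggregate until the total score change drops below epsilon.

    Assumes every node has at least one outgoing edge (no dangling nodes).
    Returns ({node_id: score}, iterations_run).
    """
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("CREATE TABLE edge (src INTEGER, dst INTEGER)")
    cur.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    cur.execute("CREATE TABLE node (id INTEGER PRIMARY KEY)")
    cur.execute("INSERT INTO node SELECT src FROM edge UNION SELECT dst FROM edge")
    cur.execute("CREATE TABLE outdeg (id INTEGER PRIMARY KEY, deg INTEGER)")
    cur.execute("INSERT INTO outdeg SELECT src, COUNT(*) FROM edge GROUP BY src")
    n = cur.execute("SELECT COUNT(*) FROM node").fetchone()[0]
    if n == 0:
        conn.close()
        return {}, 0
    cur.execute("CREATE TABLE pr (id INTEGER PRIMARY KEY, score REAL)")
    cur.execute("CREATE TABLE pr_next (id INTEGER PRIMARY KEY, score REAL)")
    cur.execute("INSERT INTO pr SELECT id, 1.0 / ? FROM node", (n,))

    iterations = 0
    for iterations in range(1, max_iters + 1):
        # One iteration: each node sums score/out-degree over its in-neighbours.
        cur.execute("DELETE FROM pr_next")
        cur.execute(
            """
            INSERT INTO pr_next (id, score)
            SELECT n.id,
                   (1.0 - :d) / :n + :d * COALESCE(SUM(p.score / o.deg), 0.0)
            FROM node n
            LEFT JOIN edge e ON e.dst = n.id
            LEFT JOIN pr p ON p.id = e.src
            LEFT JOIN outdeg o ON o.id = e.src
            GROUP BY n.id
            """,
            {"d": damping, "n": n},
        )
        # Explicit, aggregation-based termination test (L1 change in scores).
        delta = cur.execute(
            "SELECT SUM(ABS(nx.score - p.score)) "
            "FROM pr_next nx JOIN pr p ON p.id = nx.id"
        ).fetchone()[0]
        cur.execute("DELETE FROM pr")
        cur.execute("INSERT INTO pr SELECT id, score FROM pr_next")
        if delta < epsilon:
            break

    scores = dict(cur.execute("SELECT id, score FROM pr").fetchall())
    conn.close()
    return scores, iterations
```

For example, pagerank_sql([(1, 2), (1, 3), (2, 3), (3, 1)]) returns a score per node plus the number of iterations run. The driver loop here is strictly synchronous: each iteration waits for the full aggregate before the next begins, which is the behaviour the summary contrasts with asynchronous execution.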