Massively Distributed Time Series Indexing and Querying

Indexing is crucial for many data mining tasks that rely on efficient and effective similarity query processing. Consequently, indexing large volumes of time series, along with high performance similarity query processing, have became topics of high interest. For many applications across diverse dom...

Full description

Saved in:
Bibliographic Details
Published in:IEEE transactions on knowledge and data engineering Vol. 32; no. 1; pp. 108 - 120
Main Authors: Yagoubi, Djamel-Edine, Akbarinia, Reza, Masseglia, Florent, Palpanas, Themis
Format: Journal Article
Language:English
Published: New York IEEE 01.01.2020
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Institute of Electrical and Electronics Engineers
Subjects:
ISSN:1041-4347, 1558-2191
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Indexing is crucial for many data mining tasks that rely on efficient and effective similarity query processing. Consequently, indexing large volumes of time series, along with high performance similarity query processing, have became topics of high interest. For many applications across diverse domains though, the amount of data to be processed might be intractable for a single machine, making existing centralized indexing solutions inefficient. We propose a parallel indexing solution that gracefully scales to billions of time series, and a parallel query processing strategy that, given a batch of queries, efficiently exploits the index. Our experiments, on both synthetic and real world data, illustrate that our index creation algorithm works on four billion time series in less than five hours, while the state of the art centralized algorithms do not scale and have their limit on 1 billion time series, where they need more than five days. Also, our distributed querying algorithm is able to efficiently process millions of queries over collections of billions of time series, thanks to an effective load balancing mechanism.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:1041-4347
1558-2191
DOI:10.1109/TKDE.2018.2880215