Distributed detection of sequential anomalies in univariate time series

Bibliographic Details
Published in:	The VLDB Journal, Vol. 30, No. 4, pp. 579-602
Main Authors:	Schneider, Johannes; Wenig, Phillip; Papenbrock, Thorsten
Format:	Journal Article
Language:	English
Published:	Berlin/Heidelberg: Springer Berlin Heidelberg, 1 July 2021; Springer Nature B.V.
Subjects:	Anomalies; Clusters; Computer Science; Database Management; Fraud; Processors; Regular Paper; Sequences; Signs and symptoms; Synchronism; Time series; Distributed programming; Sequential anomaly; Actor model; Data mining
ISSN:	1066-8888, 0949-877X
Online Access:	Full text

Description
Summary:	The automated detection of sequential anomalies in time series is an essential task for many applications, such as the monitoring of technical systems, fraud detection in high-frequency trading, or the early detection of disease symptoms. All these applications require the detection to find all sequential anomalies as fast as possible on potentially very large time series. In other words, the detection needs to be effective, efficient, and scalable with respect to the input size. Series2Graph is an effective solution based on graph embeddings: it is robust against re-occurring anomalies, can discover sequential anomalies of arbitrary length, and works without training data. Yet, Series2Graph is not scalable due to its single-threaded approach; in particular, it cannot process arbitrarily large sequences due to the memory constraints of a single machine. In this paper, we propose our distributed anomaly detection system, DADS for short, which is an efficient and scalable adaptation of Series2Graph. Based on the actor programming model, DADS distributes the input time sequence, intermediate state, and the computation to all processors of a cluster in a way that minimizes communication costs and synchronization barriers. Our evaluation shows that DADS is orders of magnitude faster than Series2Graph (S2G), scales almost linearly with the number of processors in the cluster, and can process much larger input sequences due to its scale-out property.
DOI:	10.1007/s00778-021-00657-6