FluteDB: An efficient and scalable in-memory time series database for sensor-cloud


Bibliographic details
Published in: Journal of parallel and distributed computing, Vol. 122, pp. 95 - 108
Main authors: Li, Chen; Li, Bo; Bhuiyan, Md Zakirul Alam; Wang, Lihong; Si, Jinghui; Wei, Guanyu; Li, Jianxin
Format: Journal Article
Language: English
Published: Elsevier Inc, 1 December 2018
ISSN:0743-7315, 1096-0848
Online access: Full text
Description
Abstract: With the widespread deployment of large-scale sensor networks, vast amounts of time series data are being generated and need to be processed. Traditional databases, however, show storage limitations when handling such large data streams in the cloud, and their reliability and availability are also difficult to guarantee. To address this problem, this paper proposes FluteDB, an efficient and scalable in-memory time series database for sensor-cloud. We analyze the unique characteristics of time series data and its associated operations to strike the right balance among efficiency, scalability, resource consumption, reliability and availability. Specifically, based on an aggregate root-cause analysis of ongoing time series problems, FluteDB applies targeted optimizations to the strategies for key operations in memory and physical storage, at the expense of an acceptable partial loss of data precision and consistency. FluteDB's enhanced strategies primarily comprise the Triggered Time Series Merge Tree (TTSM Tree), time-series-enhanced cache management, and corresponding compression algorithms for different data types. Validation of each sub-module demonstrates that the improved strategies significantly outperform existing methods in a real time series environment. Global experimental results also show that the integrated FluteDB reduces query latency by 17x, improves write rate by 98x and saves about 47% of storage resources. The average available service time and the recovery rate and degree of FluteDB are competitive with state-of-the-art reliability and availability strategies under real and simulated faults, which demonstrates that FluteDB can provide highly stable large-scale data cloud services.
• FluteDB is an efficient and scalable time series database for sensor-cloud.
• The index in FluteDB is equipped with flexible storage techniques for time series data.
• FluteDB improves efficiency by adjusting disk accesses according to data temperature.
• FluteDB optimizes its data encapsulation and fault-tolerance strategies.
DOI:10.1016/j.jpdc.2018.07.021