Research on Join Operation of Temporal Big Data in Distributed Environment

Detailed description

Bibliographic details
Published in:	Ji suan ji gong cheng, Vol. 45, No. 3, pp. 20-25, 31
Authors:	ZHANG Wei, WANG Zhijie
Format:	Journal Article
Language:	Chinese, English
Published:	Editorial Office of Computer Engineering, 1 March 2019
Subjects:	temporal big data | distributed memory computing | temporal join | two-level index | partition method | Spark framework
ISSN:	1000-3428
Online access:	Full text

Description
Summary:	Distributed systems are an ideal choice for processing join operations on temporal big data, but existing distributed systems do not natively support temporal join queries and cannot meet the low-latency, high-throughput requirements of temporal big data processing. Therefore, a two-level in-memory index scheme based on Spark is proposed. A global index prunes the distributed partitions, and a local temporal index is used to query within each remaining partition, improving data retrieval efficiency. A partitioning method designed for temporal data further optimizes global pruning. Experimental results on real and synthetic datasets show that the scheme can significantly improve the processing efficiency of temporal join operations.
DOI:	10.19678/j.issn.1000-3428.0052626
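
The two-level idea in the summary (a global index that prunes partitions, then a local temporal index that answers the query inside each surviving partition) can be illustrated with a small single-machine sketch. This is not the authors' Spark implementation: it uses plain Python, assumes fixed-width partitioning by interval start time, and treats the join as an interval-overlap join. All names (`Interval`, `Partition`, `build_partitions`, `temporal_join`) are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Interval:
    key: str
    start: int  # inclusive
    end: int    # inclusive


@dataclass
class Partition:
    lo: int                                     # smallest start time routed to this partition
    rows: list = field(default_factory=list)    # intervals, sorted by start after seal()
    starts: list = field(default_factory=list)  # parallel list of start times (local index)
    max_end: int = -1                           # largest end time stored (used for global pruning)

    def add(self, iv):
        self.rows.append(iv)
        self.max_end = max(self.max_end, iv.end)

    def seal(self):
        # Build the local temporal index: sort once, then binary-search per query.
        self.rows.sort(key=lambda iv: iv.start)
        self.starts = [iv.start for iv in self.rows]

    def overlapping(self, q):
        # Rows with start <= q.end form a prefix of the sorted list.
        cut = bisect_right(self.starts, q.end)
        return [iv for iv in self.rows[:cut] if iv.end >= q.start]


def build_partitions(intervals, width):
    """Assumed partitioning: bucket intervals by start time in fixed-width slots."""
    parts = {}
    for iv in intervals:
        slot = iv.start // width
        if slot not in parts:
            parts[slot] = Partition(lo=slot * width)
        parts[slot].add(iv)
    for part in parts.values():
        part.seal()
    # The sorted list of (lo, max_end) bounds plays the role of the global index.
    return sorted(parts.values(), key=lambda p: p.lo)


def temporal_join(left, right_parts):
    """Return (left, right) pairs whose intervals overlap."""
    out = []
    for q in left:
        for part in right_parts:
            # Global pruning: skip partitions that cannot contain an overlap.
            if part.lo > q.end or part.max_end < q.start:
                continue
            out.extend((q, iv) for iv in part.overlapping(q))
    return out
```

In this sketch, a partition is skipped when all of its start times lie after the query interval or all of its end times lie before it, so only partitions that may contain a match are searched with the local index.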