An intermediate data placement algorithm for load balancing in Spark computing environment

Since MapReduce became an effective and popular programming framework for parallel data processing, key skew in intermediate data has become a major system performance bottleneck. To address the load imbalance of bucket containers in the shuffle process of the Spark computing framewor...

Bibliographic Details
Published in:	Future Generation Computer Systems, Vol. 78, pp. 287-301
Main Authors:	Tang, Zhuo, Zhang, Xiangshen, Li, Kenli, Li, Keqin
Format:	Journal Article
Language:	English
Published:	Elsevier B.V., 01.01.2018
Subjects:	Data sampling, Data skew, Load balancing, MapReduce, Spark
ISSN:	0167-739X, 1872-7115