Performance Factor Analysis and Scope of Optimization for Big Data Processing on Cluster

Bibliographic Details
Published in: International Conference on Parallel, Distributed and Grid Computing (PDGC ...), pp. 418 - 423
Main Authors: Godara, Hanuman, Govil, M.C., Pilli, E.S.
Format: Conference Proceeding
Language: English
Published: IEEE 01.12.2018
ISSN: 2573-3079
Description
Summary: The use of computational clusters for large-scale Big Data processing has attracted attention as a technology trend because of its time efficiency. A modern cluster, equipped with the latest multi- and many-core distributed shared architecture, high-speed interconnect and file system, delivers high performance through message-passing and multi-threading parallel approaches and handles batch, micro-batch and stream processing of high-dimensional massive datasets. However, running data-intensive Big Data applications on a compute-centric cluster poses performance challenges because of several runtime overheads. To alleviate these bottlenecks and exploit the full potential of the cluster, a state-of-the-practice, performance-oriented technical analysis covering all relevant aspects is presented in the context of Terascale Big Data processing on the TeraFLOPS cluster PARAM-Kanchenjunga. The major factors influencing performance, i.e. the sources of overheads related to computation, communication or IPC, memory, I/O contention, scheduling, load imbalance, synchronization, latency and network jitter, are identified and their impact determined. As existing approaches are found insufficient, advanced methods and a variety of alternatives, such as RDMA-enabled libraries, PFS, MPI-integrated extensions, loop tiling and hybrid parallelization, are provided for consideration in optimization. This paper will assist in preparing performance-aware design of experiments and performance modeling.
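Among the alternatives the abstract names are hybrid parallelization (message passing combined with multi-threading) and loop tiling. The following C fragment is a minimal illustrative sketch of that combination, not code from the paper: MPI ranks each process a local array, OpenMP threads share cache-sized tiles of the inner loop, and a reduction combines the per-rank results. The array size N, tile size TILE and the sum-of-squares kernel are assumptions chosen only for illustration.

/* Hypothetical sketch: hybrid MPI + OpenMP with loop tiling (not from the paper). */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

#define N    (1 << 20)   /* elements per rank (assumed) */
#define TILE 4096        /* tile size chosen to fit in cache (assumed) */

int main(int argc, char **argv)
{
    int provided, rank, size;

    /* Request thread support so MPI and OpenMP can coexist safely. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *a = malloc(N * sizeof(double));
    for (long i = 0; i < N; i++)
        a[i] = (double)(rank + 1);

    double local_sum = 0.0;

    /* Loop tiling: iterate over cache-sized blocks; OpenMP threads share
     * the tiles and reduce partial sums into local_sum. */
    #pragma omp parallel for reduction(+:local_sum) schedule(static)
    for (long t = 0; t < N; t += TILE) {
        long end = (t + TILE < N) ? t + TILE : N;
        for (long i = t; i < end; i++)
            local_sum += a[i] * a[i];
    }

    /* Message passing: combine per-rank results across the cluster. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0,
               MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum of squares = %f (ranks = %d)\n", global_sum, size);

    free(a);
    MPI_Finalize();
    return 0;
}

In such a hybrid layout one MPI rank per node with one OpenMP thread per core is a common starting point; the tile size is then tuned to the cache hierarchy of the target cluster.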
DOI: 10.1109/PDGC.2018.8745857