Real-Time Processing of Big Data Streams: Lifecycle, Tools, Tasks, and Challenges

Bibliographic Details
Published in: 2018 2nd International Symposium on Multidisciplinary Studies and Innovative Technologies (ISMSIT), pp. 1-6
Main Authors: Gurcan, Fatih; Berigel, Muhammet
Format: Conference Proceeding
Language: English, Japanese
Published: IEEE 01.10.2018
Description
Summary: In today's technological environments, the vast majority of big data-driven applications and solutions are based on real-time processing of streaming data. Real-time processing and analytics of big data streams therefore play a crucial role in the development of such applications and solutions. From this perspective, this paper defines a lifecycle for real-time big data processing. It describes existing tools, tasks, and frameworks by associating them with the phases of the lifecycle, which include data ingestion, data storage, stream processing, analytical data store, and analysis and reporting. The paper also investigates real-time big data processing tools, namely Flume, Kafka, NiFi, Storm, Spark Streaming, S4, Flink, Samza, HBase, Hive, Cassandra, Splunk, and SAP HANA. It also discusses current challenges of real-time big data processing, such as "volume, variety and heterogeneity", "data capture and storage", "inconsistency and incompleteness", "scalability", "real-time processing", "data visualization", "skill requirements", and "privacy and security". This paper may provide valuable insight into the lifecycle, related tools and tasks, and challenges of real-time big data processing.
DOI: 10.1109/ISMSIT.2018.8567061