SpecK: Composition of Stream Processing Applications over Fog Environments

Bibliographic Details
Title: SpecK: Composition of Stream Processing Applications over Fog Environments
Authors: Battulga, Davaadorj, Miorandi, Daniele, Tedeschi, Cédric
Contributors: U-Hopper srl Trento, Design and Implementation of Autonomous Distributed Systems (MYRIADS), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), European Project: 765452,h2020,H2020-MSCA-ITN-2017,FogGuru(2017)
Source: DAIS 2021 - 21st International Conference on Distributed Applications and Interoperable Systems ; https://inria.hal.science/hal-03259975 ; DAIS 2021 - 21st International Conference on Distributed Applications and Interoperable Systems, Jun 2021, Valletta, Malta. pp.38-54, ⟨10.1007/978-3-030-78198-9_3⟩
Publisher Information: HAL CCSD
Publication Year: 2021
Collection: Université de Rennes 1: Publications scientifiques (HAL)
Subject Terms: stream processing, deployment, geographically distributed platforms, [INFO.INFO-OS]Computer Science [cs]/Operating Systems [cs.OS], [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Subject Geographic: Valletta, Malta
Description: Stream Processing (SP), i.e., the processing of data in motion, as soon as it becomes available, is a hot topic in cloud computing. Various SP stacks exist today, with applications ranging from IoT analytics to the processing of payment transactions. The backbone of these stacks is the Stream Processing Engine (SPE), a software package offering a high-level programming model and scalable execution of data stream processing pipelines. SPEs have traditionally been developed to work inside a single datacenter and optimised for speed. With the advent of Fog computing, however, the processing of data streams needs to be carried out over multiple geographically distributed computing sites: data is typically pre-processed close to where it is generated, then aggregated at intermediate nodes, and finally stored globally and persistently in the Cloud. SPEs were not designed to address these new scenarios. In this paper, we argue that large-scale Fog-based stream processing should rely on the coordinated composition of geographically dispersed SPE instances. We propose an architecture based on the composition of multiple SPE instances and their communication via distributed message brokers. We introduce SpecK, a tool to automate the deployment and adaptation of pipelines over a Fog computing platform. Given a description of the pipeline, SpecK covers all the operations needed to deploy a stream processing computation over the different SPE instances targeted, using their own APIs and establishing the required communication channels to forward data among them. A prototypical implementation of SpecK is presented, and its performance is evaluated over Grid'5000, a large-scale, distributed experimental facility.
Document Type: conference object
Language: English
Relation: info:eu-repo/grantAgreement//765452/EU/FogGuru: Training the Next Generation of European Fog Computing Experts/FogGuru
DOI: 10.1007/978-3-030-78198-9_3
Availability: https://inria.hal.science/hal-03259975
https://inria.hal.science/hal-03259975v1/document
https://inria.hal.science/hal-03259975v1/file/dais.pdf
https://doi.org/10.1007/978-3-030-78198-9_3
Rights: info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.DF3B6AE7
Database: BASE
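
Illustrative note: the abstract describes taking a pipeline description, deploying its parts on several SPE instances, and linking them through message brokers. The Python sketch below shows one way such a plan could be derived. It is not SpecK's actual format or API, which this record does not give. The `PIPELINE` layout, the site names, the topic naming scheme, and the `plan_deployment` function are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical pipeline description (not SpecK's real format): each operator
# is pinned to a site, and edges give the direction of data flow.
PIPELINE = {
    "operators": {
        "filter": "edge-site",
        "aggregate": "fog-site",
        "store": "cloud-site",
    },
    "edges": [("filter", "aggregate"), ("aggregate", "store")],
}


def plan_deployment(pipeline):
    """Group operators by site and derive one broker topic per cross-site edge."""
    site_of = pipeline["operators"]

    # One job per site: the operators that the SPE instance there would run.
    jobs = defaultdict(list)
    for op, site in site_of.items():
        jobs[site].append(op)

    # Edges that cross sites need a broker topic to forward data.
    topics = []
    for src, dst in pipeline["edges"]:
        if site_of[src] != site_of[dst]:
            topics.append({
                "topic": f"{src}-to-{dst}",  # assumed naming scheme
                "producer_site": site_of[src],
                "consumer_site": site_of[dst],
            })
    return dict(jobs), topics


if __name__ == "__main__":
    jobs, topics = plan_deployment(PIPELINE)
    for site, ops in jobs.items():
        print(site, "runs", ops)
    for t in topics:
        print(t["topic"], ":", t["producer_site"], "->", t["consumer_site"])
```

In this sketch, edges between operators on the same site produce no topic, since that data would stay inside a single SPE instance. Submitting each job through the engine's own API and creating the topics on a broker are left out, as they depend on the specific SPE and broker used.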