Ignis: An efficient and scalable multi-language Big Data framework.

Bibliographic Details
Title: Ignis: An efficient and scalable multi-language Big Data framework.
Authors: Piñeiro, César (AUTHOR) cesaralfredo.pineiro@usc.es; Martínez-Castaño, Rodrigo (AUTHOR) rodrigo.martinezl@usc.es; Pichel, Juan C. (AUTHOR) juancarlos.pichel@usc.es
Source: Future Generation Computer Systems. Apr 2020, Vol. 105, p. 705-716. 12 p.
Subject Terms: *BIG data, *PROGRAMMING languages, *ELECTRONIC data processing, *HIGH performance computing, SCIENTIFIC community, PIPING, PHYSICAL environment
Abstract: Most of the relevant Big Data processing frameworks (e.g., Apache Hadoop, Apache Spark) only support JVM (Java Virtual Machine) languages by default. In order to support non-JVM languages, subprocesses are created and connected to the framework using system pipes. With this technique, the impossibility of managing the data at thread level arises together with an important loss in the performance. To address this problem we introduce Ignis, a new Big Data framework that benefits from an elegant way to create multi-language executors managed through an RPC system. As a consequence, the new system is able to execute natively applications implemented using non-JVM languages. In addition, Ignis allows users to combine in the same application the benefits of implementing each computational task in the best suited programming language without additional overhead. The system runs completely inside Docker containers, isolating the execution environment from the physical machine. A comparison with Apache Spark shows the advantages of our proposal in terms of performance and scalability.
• A new step forward toward the real convergence of HPC and Big Data worlds.
• Efficient execution of multi-language applications without additional overhead.
• Outperforms Spark considering some of the most typical algorithmic models.
• Ignis API inspired by Spark to facilitate the adoption by the research community.
• A completely isolated framework running inside Docker containers.
[ABSTRACT FROM AUTHOR]
Copyright of Future Generation Computer Systems is the property of Elsevier B.V. and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Business Source Index
ISSN: 0167-739X
DOI: 10.1016/j.future.2019.12.052