KheOps: Cost-effective Repeatability, Reproducibility, and Replicability of Edge-to-Cloud Experiments

Bibliographic Details
Title: KheOps: Cost-effective Repeatability, Reproducibility, and Replicability of Edge-to-Cloud Experiments
Authors: Rosendo, Daniel; Keahey, Kate; Costan, Alexandru; Simonin, Matthieu; Valduriez, Patrick; Antoniu, Gabriel
Contributors: Scalable Storage for Clouds and Beyond (KerData), Centre Inria de l'Université de Rennes, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), Scientific Data Management (ZENITH), Centre Inria d'Université Côte d'Azur, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Argonne National Laboratory Lemont (ANL), University of Chicago, 
Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA), Design and Implementation of Autonomous Distributed Systems (MYRIADS), HPC-BigData Inria Challenge (IPL), JLESC, UNIFY 2, ANR-15-CE25-0003,OverFlow,Workflow Data Management as a Service pour des Applications Multi-Site(2015)
Source: ACM REP '23: Proceedings of the 2023 ACM Conference on Reproducibility and Replicability ; REP 2023 - ACM Conference on Reproducibility and Replicability ; https://hal.science/hal-04157720 ; REP 2023 - ACM Conference on Reproducibility and Replicability, Jun 2023, Santa Cruz, CA, United States. pp.62-73, ⟨10.1145/3589806.3600032⟩ ; https://dl.acm.org/doi/proceedings/10.1145/3589806
Publisher Information: CCSD
ACM
Publication Year: 2023
Collection: LIRMM: HAL (Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier)
Subject Terms: Reproducibility, Replicability, Repeatability, Computing Continuum, Workflows, Edge Computing, Cloud Computing, [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC], [INFO.INFO-NI]Computer Science [cs]/Networking and Internet Architecture [cs.NI], [INFO.INFO-SY]Computer Science [cs]/Systems and Control [cs.SY]
Subject Geographic: Santa Cruz, CA, United States
Description: International audience ; Distributed infrastructures for computation and analytics are now evolving towards an interconnected ecosystem allowing complex scientific workflows to be executed across hybrid systems spanning from IoT Edge devices to Clouds, and sometimes to supercomputers (the Computing Continuum). Understanding the performance trade-offs of large-scale workflows deployed on such a complex Edge-to-Cloud Continuum is challenging. To achieve this, one needs to systematically perform experiments, to enable their reproducibility and allow other researchers to replicate the study and the obtained conclusions on different infrastructures. This breaks down to the tedious process of reconciling the numerous experimental requirements and constraints with low-level infrastructure design choices. To address the limitations of the main state-of-the-art approaches for distributed, collaborative experimentation, such as Google Colab, Kaggle, and Code Ocean, we propose KheOps, a collaborative environment specifically designed to enable cost-effective reproducibility and replicability of Edge-to-Cloud experiments. KheOps is composed of three core elements: (1) an experiment repository; (2) a notebook environment; and (3) a multi-platform experiment methodology. We illustrate KheOps with a real-life Edge-to-Cloud application. The evaluations explore the point of view of the authors of an experiment described in an article (who aim to make their experiments reproducible) and the perspective of their readers (who aim to replicate the experiment). The results show how KheOps helps authors to systematically perform repeatable and reproducible experiments on the Grid5000 + FIT IoT LAB testbeds. Furthermore, KheOps helps readers to cost-effectively replicate the authors' experiments on different infrastructures, such as the Chameleon Cloud + CHI@Edge testbeds, and obtain the same conclusions with high accuracies (> 88% for all performance metrics).
Document Type: conference object
Language: English
Relation: info:eu-repo/semantics/altIdentifier/arxiv/2307.12796; ARXIV: 2307.12796
DOI: 10.1145/3589806.3600032
Availability: https://hal.science/hal-04157720
https://hal.science/hal-04157720v1/document
https://hal.science/hal-04157720v1/file/0_main.pdf
https://doi.org/10.1145/3589806.3600032
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Accession Number: edsbas.32D51E38
Database: BASE