Acala: Aggregate Monitoring for Geo-Distributed Cluster Federations

Bibliographic Details
Title: Acala: Aggregate Monitoring for Geo-Distributed Cluster Federations
Authors: Huang, Chih-Kai; Pierre, Guillaume
Contributors: Design and Implementation of Autonomous Distributed Systems (MYRIADS), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), Grid5000
Source: SAC 2023 - 38th ACM/SIGAPP Symposium On Applied Computing; https://inria.hal.science/hal-03899133; SAC 2023 - 38th ACM/SIGAPP Symposium On Applied Computing, Mar 2023, Tallinn, Estonia. pp.1-9, ⟨10.1145/3555776.3577716⟩
Publisher Information: HAL CCSD; ACM
Publication Year: 2023
Collection: Université de Rennes 1: Publications scientifiques (HAL)
Subject Terms: Monitoring, Geo-distributed cluster federations, Prometheus, Metrics aggregation and deduplication, Fog computing, [INFO.INFO-OS]Computer Science [cs]/Operating Systems [cs.OS], [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Geographic Terms: Tallinn, Estonia
Description: International audience; Distributed monitoring is an essential functionality that allows large cluster federations to efficiently schedule applications on a set of available geo-distributed resources. However, periodically reporting the precise status of each available server is both unnecessary for accurate scheduling and unscalable as the number of servers grows. This paper proposes Acala, a monitoring framework for geo-distributed cluster federations which aims to provide the management cluster with aggregate information about the entire cluster instead of individual servers. Our evaluations, based on an actual deployment in a controlled environment on the geo-distributed Grid'5000 testbed, show that Acala reduces the cross-cluster network traffic by up to 99% and the scrape duration by up to 55%.
Publication Type: conference object
Language: English
DOI: 10.1145/3555776.3577716
Availability: https://inria.hal.science/hal-03899133
https://inria.hal.science/hal-03899133v1/document
https://inria.hal.science/hal-03899133v1/file/main.pdf
https://doi.org/10.1145/3555776.3577716
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Document Code: edsbas.32C4BC7F
Database: BASE