Acala: Aggregate Monitoring for Geo-Distributed Cluster Federations

Bibliographic details
Title: Acala: Aggregate Monitoring for Geo-Distributed Cluster Federations
Authors: Huang, Chih-Kai, Pierre, Guillaume
Contributors: Design and Implementation of Autonomous Distributed Systems (MYRIADS), Inria Rennes – Bretagne Atlantique, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-SYSTÈMES LARGE ÉCHELLE (IRISA-D1), Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Université de Rennes (UR)-Institut National des Sciences Appliquées - Rennes (INSA Rennes), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT)-Institut de Recherche en Informatique et Systèmes Aléatoires (IRISA), Institut National des Sciences Appliquées (INSA)-Institut National des Sciences Appliquées (INSA)-Université de Bretagne Sud (UBS)-École normale supérieure - Rennes (ENS Rennes)-CentraleSupélec-Centre National de la Recherche Scientifique (CNRS)-IMT Atlantique (IMT Atlantique), Institut Mines-Télécom Paris (IMT)-Institut Mines-Télécom Paris (IMT), Grid5000
Source: SAC 2023 - 38th ACM/SIGAPP Symposium On Applied Computing ; https://inria.hal.science/hal-03899133 ; SAC 2023 - 38th ACM/SIGAPP Symposium On Applied Computing, Mar 2023, Tallinn, Estonia. pp.1-9, ⟨10.1145/3555776.3577716⟩
Publisher information: HAL CCSD
ACM
Publication year: 2023
Collection: Université de Rennes 1: Publications scientifiques (HAL)
Subjects: Monitoring, Geo-distributed cluster federations, Prometheus, Metrics aggregation and deduplication, Fog computing, [INFO.INFO-OS]Computer Science [cs]/Operating Systems [cs.OS], [INFO.INFO-DC]Computer Science [cs]/Distributed, Parallel, and Cluster Computing [cs.DC]
Geographic subject: Tallinn, Estonia
Description: Distributed monitoring is an essential functionality that allows large cluster federations to efficiently schedule applications on a set of available geo-distributed resources. However, periodically reporting the precise status of each available server is both unnecessary for accurate scheduling and unscalable as the number of servers grows. This paper proposes Acala, a monitoring framework for geo-distributed cluster federations which aims to provide the management cluster with aggregate information about the entire cluster instead of individual servers. Our evaluations, based on an actual deployment in a controlled environment on the geo-distributed Grid'5000 testbed, show that Acala reduces the cross-cluster network traffic by up to 99% and the scrape duration by up to 55%.
Document type: conference object
Language: English
DOI: 10.1145/3555776.3577716
Availability: https://inria.hal.science/hal-03899133
https://inria.hal.science/hal-03899133v1/document
https://inria.hal.science/hal-03899133v1/file/main.pdf
https://doi.org/10.1145/3555776.3577716
Rights: http://creativecommons.org/licenses/by/ ; info:eu-repo/semantics/OpenAccess
Accession number: edsbas.32C4BC7F
Database: BASE