A Serverless Engine for High Energy Physics Distributed Analysis

Bibliographic details
Published in: arXiv.org
Main authors: Kuśnierz, Jacek; Padulano, Vincenzo Eduardo; Malawski, Maciej; Burkiewicz, Kamil; Enric Tejedor Saavedra; Alonso-Jordá, Pedro; Pitt, Michael; Avati, Valentina
Format: Paper
Language: English
Published: Ithaca, Cornell University Library, arXiv.org, 2 June 2022
ISSN: 2331-8422
Online access: Full text
Abstract The Large Hadron Collider (LHC) at CERN has generated in the last decade an unprecedented volume of data for the High-Energy Physics (HEP) field. Scientific collaborations interested in analysing such data very often require computing power beyond a single machine. This issue has traditionally been tackled by running analyses in distributed environments using stateful, managed batch computing systems. While this approach has been effective so far, current estimates for the future computing needs of the field present large scaling challenges. Such a managed approach may not be the only viable way to tackle them, and serverless architectures could provide an interesting alternative with an even larger scaling potential. This work describes a novel approach to running real HEP scientific applications through a distributed serverless computing engine. The engine is built upon ROOT, a well-established HEP data analysis software, and distributes its computations to a large pool of concurrent executions on the Amazon Web Services Lambda serverless platform. Thanks to the developed tool, physicists are able to access datasets stored at CERN (including those under restricted access policies) and process them on remote infrastructures outside of their typical environment. The execution of the serverless functions is monitored at runtime to gather performance metrics, both for data- and computation-intensive workloads.
Author Alonso-Jordá, Pedro
Avati, Valentina
Malawski, Maciej
Enric Tejedor Saavedra
Burkiewicz, Kamil
Pitt, Michael
Kuśnierz, Jacek
Padulano, Vincenzo Eduardo
Author_xml – sequence: 1
  givenname: Jacek
  surname: Kuśnierz
  fullname: Kuśnierz, Jacek
– sequence: 2
  givenname: Vincenzo
  surname: Padulano
  middlename: Eduardo
  fullname: Padulano, Vincenzo Eduardo
– sequence: 3
  givenname: Maciej
  surname: Malawski
  fullname: Malawski, Maciej
– sequence: 4
  givenname: Kamil
  surname: Burkiewicz
  fullname: Burkiewicz, Kamil
– sequence: 5
  fullname: Enric Tejedor Saavedra
– sequence: 6
  givenname: Pedro
  surname: Alonso-Jordá
  fullname: Alonso-Jordá, Pedro
– sequence: 7
  givenname: Michael
  surname: Pitt
  fullname: Pitt, Michael
– sequence: 8
  givenname: Valentina
  surname: Avati
  fullname: Avati, Valentina
ContentType Paper
Copyright 2022. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.48550/arxiv.2206.00942
DatabaseName ProQuest SciTech Collection
ProQuest Technology Collection
Materials Science & Engineering Collection
ProQuest Central (Alumni)
ProQuest Central UK/Ireland
ProQuest Central Essentials
ProQuest Central
ProQuest Technology Collection
ProQuest One Community College
ProQuest Central Korea
SciTech Premium Collection
ProQuest Engineering Collection
Engineering Database
ProQuest One Academic
ProQuest One Academic (New)
Publicly Available Content Database
ProQuest One Academic Middle East (New)
ProQuest One Academic Eastern Edition (DO NOT USE)
ProQuest One Applied & Life Sciences
ProQuest One Academic (retired)
ProQuest One Academic UKI Edition
ProQuest Central China
Engineering Collection
DatabaseTitle Publicly Available Content Database
Engineering Database
Technology Collection
ProQuest One Academic Middle East (New)
ProQuest Central Essentials
ProQuest One Academic Eastern Edition
ProQuest Central (Alumni Edition)
SciTech Premium Collection
ProQuest One Community College
ProQuest Technology Collection
ProQuest SciTech Collection
ProQuest Central China
ProQuest Central
ProQuest One Applied & Life Sciences
ProQuest Engineering Collection
ProQuest One Academic UKI Edition
ProQuest Central Korea
Materials Science & Engineering Collection
ProQuest Central (New)
ProQuest One Academic
ProQuest One Academic (New)
Engineering Collection
DatabaseTitleList Publicly Available Content Database
Database_xml – sequence: 1
  dbid: PIMPY
  name: ProQuest Publicly Available Content Database
  url: http://search.proquest.com/publiccontent
  sourceTypes: Aggregation Database
DeliveryMethod fulltext_linktorsrc
Discipline Physics
EISSN 2331-8422
Genre Working Paper/Pre-Print
ID FETCH-LOGICAL-a521-b430a606d4021630ef69a7bcbc0bb530bc6a1e3db1a7cbf80ddc2062f88f1a833
IEDL.DBID M7S
IngestDate Mon Jun 30 09:23:08 EDT 2025
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-a521-b430a606d4021630ef69a7bcbc0bb530bc6a1e3db1a7cbf80ddc2062f88f1a833
Notes SourceType-Working Papers-1
ObjectType-Working Paper/Pre-Print-1
content type line 50
OpenAccessLink https://www.proquest.com/docview/2672840611?pq-origsite=%requestingapplication%
PQID 2672840611
PQPubID 2050157
ParticipantIDs proquest_journals_2672840611
PublicationCentury 2000
PublicationDate 20220602
PublicationDateYYYYMMDD 2022-06-02
PublicationDate_xml – month: 06
  year: 2022
  text: 20220602
  day: 02
PublicationDecade 2020
PublicationPlace Ithaca
PublicationPlace_xml – name: Ithaca
PublicationTitle arXiv.org
PublicationYear 2022
Publisher Cornell University Library, arXiv.org
Publisher_xml – name: Cornell University Library, arXiv.org
SSID ssj0002672553
Score 1.7964649
SecondaryResourceType preprint
SourceID proquest
SourceType Aggregation Database
SubjectTerms CERN
Computation
Data analysis
Large Hadron Collider
Performance measurement
System effectiveness
Web services
Title A Serverless Engine for High Energy Physics Distributed Analysis
URI https://www.proquest.com/docview/2672840611
hasFullText 1
inHoldings 1
isFullTextHit
isPrint