A Serverless Engine for High Energy Physics Distributed Analysis
| Published in: | 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid), pp. 575-584 |
|---|---|
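| Main Authors: | Kusnierz, Jacek; Padulano, Vincenzo E.; Malawski, Maciej; Burkiewicz, Kamil; Saavedra, Enric Tejedor; Alonso-Jorda, Pedro; Pitt, Michael; Avati, Valentina |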
| Format: | Conference Proceeding |
| Language: | English |
| Published: | IEEE, 01.05.2022 |
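| Subjects: | AWS; C++ languages; CERN; Codes; Computer architecture; Distributed Computing; HEP; Lambda; Large Hadron Collider; MapReduce; ROOT; Runtime; Serverless; Serverless computing; Web services |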
| Abstract | Over the last decade, the Large Hadron Collider (LHC) at CERN has generated an unprecedented volume of data for the High-Energy Physics (HEP) field. Scientific collaborations analysing such data very often require computing power beyond a single machine. This issue has traditionally been tackled by running analyses in distributed environments using stateful, managed batch computing systems. While this approach has been effective so far, current estimates of the field's future computing needs present large scaling challenges. Such a managed approach may not be the only viable way to tackle them, and serverless architectures could provide an interesting alternative with even larger scaling potential. This work describes a novel approach to running real HEP scientific applications through a distributed serverless computing engine. The engine is built upon ROOT, a well-established HEP data analysis software, and distributes its computations to a large pool of concurrent executions on the Amazon Web Services Lambda serverless platform. Thanks to the developed tool, physicists are able to access datasets stored at CERN (including those under restricted access policies) and process them on remote infrastructures outside of their typical environment. The analysis of the serverless functions is monitored at runtime to gather performance metrics, both for data- and computation-intensive workloads. |
|---|---|
| Author | Avati, Valentina; Malawski, Maciej; Kusnierz, Jacek; Saavedra, Enric Tejedor; Burkiewicz, Kamil; Alonso-Jorda, Pedro; Pitt, Michael; Padulano, Vincenzo E. |
| Author_xml | 1. Jacek Kusnierz (kusnierz@protonmail.com), Institute of Computer Science, AGH, Kraków, Poland; 2. Vincenzo E. Padulano (vincenzo.eduardo.padulano@cern.ch), EP-SFT, CERN, Geneva, Switzerland; 3. Maciej Malawski (malawski@agh.edu.pl), Institute of Computer Science, AGH, Kraków, Poland; 4. Kamil Burkiewicz, Institute of Computer Science, AGH, Kraków, Poland; 5. Enric Tejedor Saavedra, Institute of Computer Science, AGH, Kraków, Poland; 6. Pedro Alonso-Jorda (palonso@upv.es), DSIC, UPV, Valencia, Spain; 7. Michael Pitt, EP-CMG-OS, CERN, Geneva, Switzerland; 8. Valentina Avati, EP-UHC, CERN, Geneva, Switzerland |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CCGrid54584.2022.00067 |
| EISBN | 9781665499569 1665499567 |
| EndPage | 584 |
| ExternalDocumentID | 9826036 |
| Genre | orig-research |
| OpenAccessLink | http://cds.cern.ch/record/2815205 |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-May |
| PublicationDateYYYYMMDD | 2022-05-01 |
| PublicationDate_xml | – month: 05 year: 2022 text: 2022-May |
| PublicationDecade | 2020 |
| PublicationTitle | 2022 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing (CCGrid) |
| PublicationTitleAbbrev | CCGRID |
| PublicationYear | 2022 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| StartPage | 575 |
| SubjectTerms | AWS; C++ languages; CERN; Codes; Computer architecture; Distributed Computing; HEP; Lambda; Large Hadron Collider; MapReduce; ROOT; Runtime; Serverless; Serverless computing; Web services |
| Title | A Serverless Engine for High Energy Physics Distributed Analysis |
| URI | https://ieeexplore.ieee.org/document/9826036 |
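
The abstract describes an engine that distributes an analysis across a large pool of concurrent serverless executions, and the subject terms list MapReduce. The sketch below illustrates only that general pattern and is not the authors' implementation: a local thread pool stands in for concurrent AWS Lambda invocations, a toy formula stands in for real event data, and the names `split_ranges`, `mapper`, `reducer`, and `run_distributed` are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_ranges(n_entries, n_chunks):
    """Split [0, n_entries) into contiguous (start, stop) ranges of near-equal size."""
    n_chunks = max(1, min(n_chunks, n_entries))
    base, extra = divmod(n_entries, n_chunks)
    ranges, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks take one additional entry each.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def mapper(entry_range):
    """Assumed stand-in for one serverless invocation: fill a partial histogram."""
    start, stop = entry_range
    partial = Counter()
    for entry in range(start, stop):
        value = (entry * 7) % 100  # toy observable, not real event data
        partial[value // 10] += 1  # histogram bins of width 10
    return partial


def reducer(partials):
    """Merge partial histograms by summing the counts of each bin."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


def run_distributed(n_entries, n_chunks):
    """Map over entry ranges concurrently, then reduce the partial results."""
    ranges = split_ranges(n_entries, n_chunks)
    # Threads are used here only to mimic a pool of concurrent executions.
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        partials = list(pool.map(mapper, ranges))
    return reducer(partials)
```

Because each range is processed independently and the merge is a plain sum of counts, the result does not depend on how many chunks are used, which is the property that lets this kind of workload scale out to many stateless workers.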