HBase/Phoenix-based Data Collection and Storage for the ATLAS EventIndex.

Detailed bibliography
Title: HBase/Phoenix-based Data Collection and Storage for the ATLAS EventIndex.
Authors: García Montoro, Carlos; Sánchez, Javier; Barberis, Dario; González de la Hoz, Santiago; Salt, Jose
Source: EPJ Web of Conferences; 5/6/2024, Vol. 295, p1-7, 7p
Subjects: ACQUISITION of data, BACK up systems, DATA warehousing, ORCHESTRA, EXTRACTION apparatus
Abstract: The ATLAS EventIndex is the global catalogue of all ATLAS real and simulated events. During the LHC long shutdown between Run 2 (2015-2018) and Run 3 (2022-2025) its components were substantially revised, and a new system was deployed for the start of Run 3 in Spring 2022. The new core storage system is based on HBase tables with a Phoenix interface. It allows faster data ingestion rates and scales better than the old system. This paper describes the data collection, the technical design of the core storage, and the properties that make it fast and efficient, namely the compact and optimized design of the events table, which already holds more than 400 billion entries, and all the auxiliary tables, and the EventIndex Supervisor, in charge of orchestrating the whole data collection, now simplified thanks to the Loaders, the Spark jobs that load the data into the new core system. The extractors, in charge of preparing the pieces of data that the loaders will put into the final back-end, have been updated too. The data migration from HDFS to HBase and Phoenix is also described. [ABSTRACT FROM AUTHOR]
Copyright of EPJ Web of Conferences is the property of EDP Sciences and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Header DbId: edb
DbLabel: Complementary Index
An: 177902233
RelevancyScore: 998
AccessLevel: 6
PubType: Conference
PubTypeId: conference
PreciseRelevancyScore: 998.42138671875
IllustrationInfo
Items – Name: Title
  Label: Title
  Group: Ti
  Data: HBase/Phoenix-based Data Collection and Storage for the ATLAS EventIndex.
– Name: Author
  Label: Authors
  Group: Au
  Data: García Montoro, Carlos; Sánchez, Javier; Barberis, Dario; González de la Hoz, Santiago; Salt, Jose
– Name: TitleSource
  Label: Source
  Group: Src
  Data: EPJ Web of Conferences; 5/6/2024, Vol. 295, p1-7, 7p
– Name: Subject
  Label: Subject Terms
  Group: Su
  Data: ACQUISITION of data, BACK up systems, DATA warehousing, ORCHESTRA, EXTRACTION apparatus
PLink https://erproxy.cvtisr.sk/sfx/access?url=https://search.ebscohost.com/login.aspx?direct=true&site=eds-live&db=edb&AN=177902233
RecordInfo BibRecord:
  BibEntity:
    Identifiers:
      – Type: doi
        Value: 10.1051/epjconf/202429501034
    Languages:
      – Code: eng
        Text: English
    PhysicalDescription:
      Pagination:
        PageCount: 7
        StartPage: 1
    Subjects:
      – SubjectFull: ACQUISITION of data
        Type: general
      – SubjectFull: BACK up systems
        Type: general
      – SubjectFull: DATA warehousing
        Type: general
      – SubjectFull: ORCHESTRA
        Type: general
      – SubjectFull: EXTRACTION apparatus
        Type: general
    Titles:
      – TitleFull: HBase/Phoenix-based Data Collection and Storage for the ATLAS EventIndex.
        Type: main
  BibRelationships:
    HasContributorRelationships:
      – PersonEntity:
          Name:
            NameFull: García Montoro, Carlos
      – PersonEntity:
          Name:
            NameFull: Sánchez, Javier
      – PersonEntity:
          Name:
            NameFull: Barberis, Dario
      – PersonEntity:
          Name:
            NameFull: González de la Hoz, Santiago
      – PersonEntity:
          Name:
            NameFull: Salt, Jose
    IsPartOfRelationships:
      – BibEntity:
          Dates:
            – D: 06
              M: 05
              Text: 5/6/2024
              Type: published
              Y: 2024
          Identifiers:
            – Type: issn-print
              Value: 21016275
          Numbering:
            – Type: volume
              Value: 295
          Titles:
            – TitleFull: EPJ Web of Conferences
              Type: main