Large-Scale Merging of Histograms using Distributed In-Memory Computing
Most high-energy physics analysis jobs are embarrassingly parallel except for the final merging of the output objects, which are typically histograms. Currently, the merging of output histograms scales badly. The running time for distributed merging depends not only on the overall number of bins but...
| Published in: | Journal of physics. Conference series Vol. 664; no. 9; pp. 92003 - 92008 |
|---|---|
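| Main authors: | Blomer, Jakob; Ganis, Gerardo |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Bristol: IOP Publishing, 23 December 2015 |
| Subjects: | Algorithms; Data storage; Distributed memory; Histograms; Nodes; Physics; Run time (computers) |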
| ISSN: | 1742-6588, 1742-6596 |
| Online access: | https://iopscience.iop.org/article/10.1088/1742-6596/664/9/092003 |
| Abstract | Most high-energy physics analysis jobs are embarrassingly parallel except for the final merging of the output objects, which are typically histograms. Currently, the merging of output histograms scales badly. The running time for distributed merging depends not only on the overall number of bins but also on the number of partial histogram output files. This means that, while the time to analyze data decreases linearly with the number of worker nodes, the time to merge the histograms in fact increases with the number of worker nodes. On the grid, merging jobs that take a few hours are not unusual. To improve the situation, we present a distributed and decentralized merging algorithm whose running time is independent of the number of worker nodes. We exploit the full bisection bandwidth of local networks and keep all intermediate results in memory. We present benchmarks from an implementation using the Parallel ROOT Facility (PROOF) and RAMCloud, a distributed key-value store that keeps all data in DRAM. |
|---|---|
| Author | Blomer, Jakob Ganis, Gerardo |
| Author_xml | 1. Blomer, Jakob (jblomer@cern.ch), CERN, Geneva, Switzerland; 2. Ganis, Gerardo, CERN, Geneva, Switzerland |
| ContentType | Journal Article |
| Copyright | Published under licence by IOP Publishing Ltd 2015. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1088/1742-6596/664/9/092003 |
| Discipline | Physics |
| EISSN | 1742-6596 |
| ISICitedReferencesCount | 3 |
| ISSN | 1742-6588 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 9 |
| Language | English |
| License | Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI. http://iopscience.iop.org/info/page/text-and-data-mining http://creativecommons.org/licenses/by/3.0 |
| OpenAccessLink | https://iopscience.iop.org/article/10.1088/1742-6596/664/9/092003 |
| PageCount | 6 |
| PublicationCentury | 2000 |
| PublicationDate | 20151223 |
| PublicationDateYYYYMMDD | 2015-12-23 |
| PublicationDecade | 2010 |
| PublicationPlace | Bristol |
| PublicationTitle | Journal of physics. Conference series |
| PublicationTitleAlternate | J. Phys.: Conf. Ser |
| PublicationYear | 2015 |
| Publisher | IOP Publishing |
| StartPage | 92003 |
| SubjectTerms | Algorithms Data storage Distributed memory Histograms Nodes Physics Run time (computers) |
| Title | Large-Scale Merging of Histograms using Distributed In-Memory Computing |
| URI | https://iopscience.iop.org/article/10.1088/1742-6596/664/9/092003 https://www.proquest.com/docview/2576533482 |
| Volume | 664 |
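
The abstract states that the merging time can be made independent of the number of worker nodes but does not spell out the algorithm. The sketch below illustrates one general way such independence can arise, and it is an assumption rather than the authors' method: the bins are partitioned into slices, each worker owns one slice, and each owner sums only its slice across all partial histograms. The function name `merge_by_bin_slices` and the slicing scheme are hypothetical, and the loop over owners runs sequentially here, whereas in a real deployment each owner would work in parallel on data fetched over the network.

```python
from typing import List


def merge_by_bin_slices(partials: List[List[float]]) -> List[float]:
    """Merge partial histograms by giving each worker one slice of the bins."""
    n_workers = len(partials)
    n_bins = len(partials[0])
    if any(len(p) != n_bins for p in partials):
        raise ValueError("all partial histograms must have the same binning")

    # Hypothetical partitioning: worker w owns bins [bounds[w], bounds[w + 1]).
    bounds = [w * n_bins // n_workers for w in range(n_workers + 1)]

    merged = [0.0] * n_bins
    for owner in range(n_workers):
        lo, hi = bounds[owner], bounds[owner + 1]
        # The owner sums its slice across all partial histograms. That is
        # (hi - lo) * n_workers additions, i.e. roughly n_bins per owner,
        # however many workers take part.
        for b in range(lo, hi):
            merged[b] = sum(p[b] for p in partials)
    return merged


partials = [
    [1.0, 0.0, 2.0, 1.0],  # partial histogram from worker 0
    [0.0, 3.0, 1.0, 1.0],  # partial histogram from worker 1
]
print(merge_by_bin_slices(partials))  # [1.0, 3.0, 3.0, 2.0]
```

With a single central merger, the work grows with the number of partial histograms. In the sliced scheme, each owner handles a share of the bins that shrinks as workers are added, so the per-owner cost stays close to the total bin count.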