Multi-level load balancing with an integrated runtime approach

Bibliographic details
Published in: 2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 31-40
Main authors: Bak, Seonmyeong; Menon, Harshitha; White, Sam; Diener, Matthias; Kale, Laxmikant
Format: Conference paper
Language: English
Publication details: Piscataway, NJ, USA: IEEE Press, 1 May 2018
IEEE
Series: ACM Conferences
Subject:
ISBN: 1538658151, 9781538658154
Abstract The recent trend of increasing numbers of cores per chip has resulted in vast amounts of on-node parallelism. These high core counts result in hardware variability that introduces imbalance. Applications are also becoming more complex, resulting in dynamic load imbalance. Load imbalance of any kind can result in loss of performance and system utilization. We address the challenge of handling both transient and persistent load imbalances while maintaining locality with low overhead. In this paper, we propose an integrated runtime system that combines the Charm++ distributed programming model with concurrent tasks to mitigate load imbalances within and across shared memory address spaces. It combines a periodic assignment of work to cores, based on load measurement, with user-created tasks to handle load imbalance. We integrate OpenMP with Charm++ to enable the creation of potential tasks via OpenMP's parallel loop construct. This is also available to MPI applications through the Adaptive MPI implementation. We demonstrate the benefits of our work on three applications. We show performance improvements for Lassen of 29.6% on Cori and 46.5% on Theta. We also demonstrate benefits for a Charm++ application, ChaNGa, of 25.7% on Theta, as well as for an MPI proxy application, Kripke, using Adaptive MPI.
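Illustrative sketch (not from the paper): the abstract mentions a periodic assignment of work to cores based on load measurement, but it does not name the strategy used. The Python example below assumes a simple greedy heuristic, which places the heaviest measured work unit on the currently least-loaded core, only to show what such a measurement-driven reassignment step could look like. The names `greedy_rebalance`, `core_totals`, and the sample loads are hypothetical and are not part of Charm++, OpenMP, or Adaptive MPI. The task-based handling of transient imbalance described in the abstract is not modeled here.

```python
import heapq


def greedy_rebalance(measured_loads, num_cores):
    """Map each work unit to a core, heaviest unit first (assumed heuristic)."""
    # Min-heap of (accumulated load, core id); the least-loaded core is on top.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {}
    # Visit work units in order of decreasing measured load.
    for unit, load in sorted(measured_loads.items(),
                             key=lambda kv: kv[1], reverse=True):
        core_load, core = heapq.heappop(heap)
        assignment[unit] = core
        heapq.heappush(heap, (core_load + load, core))
    return assignment


def core_totals(measured_loads, assignment, num_cores):
    """Sum the measured load placed on each core."""
    totals = [0.0] * num_cores
    for unit, core in assignment.items():
        totals[core] += measured_loads[unit]
    return totals


# Hypothetical loads measured during the previous period (arbitrary units).
loads = {"a": 4.0, "b": 3.0, "c": 3.0, "d": 2.0}
assignment = greedy_rebalance(loads, 2)
totals = core_totals(loads, assignment, 2)
print(assignment, totals)
```

In a periodic scheme, a step like this would be rerun with fresh measurements at each balancing interval. A real runtime would also weigh migration cost and locality, which this sketch ignores.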
Author White, Sam
Diener, Matthias
Kale, Laxmikant
Bak, Seonmyeong
Menon, Harshitha
Author_xml – sequence: 1
  givenname: Seonmyeong
  surname: Bak
  fullname: Bak, Seonmyeong
  email: sbak5@illinois.edu
  organization: University of Illinois at Urbana-Champaign
– sequence: 2
  givenname: Harshitha
  surname: Menon
  fullname: Menon, Harshitha
  email: harshitha@llnl.gov
  organization: Lawrence Livermore National Laboratory
– sequence: 3
  givenname: Sam
  surname: White
  fullname: White, Sam
  email: white67@illinois.edu
  organization: University of Illinois at Urbana-Champaign
– sequence: 4
  givenname: Matthias
  surname: Diener
  fullname: Diener, Matthias
  email: mdiener@illinois.edu
  organization: University of Illinois at Urbana-Champaign
– sequence: 5
  givenname: Laxmikant
  surname: Kale
  fullname: Kale, Laxmikant
  email: kale@illinois.edu
  organization: University of Illinois at Urbana-Champaign
CODEN IEEPAD
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/CCGRID.2018.00018
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList

Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 1538658151
9781538658154
EndPage 40
ExternalDocumentID 8411007
Genre orig-research
GroupedDBID 6IE
6IF
6IL
6IN
AAJGR
ABLEC
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
IEGSK
OCL
RIB
RIC
RIE
RIL
AAWTH
ID FETCH-LOGICAL-a278t-c0f2407aa72c1c45f32e9d78ae652425ef9fddc0ab9561209cfdf937c9ec76fc3
IEDL.DBID RIE
ISBN 1538658151
9781538658154
ISICitedReferencesCount 13
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000494275100004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:48:40 EDT 2025
Sat Jun 15 16:36:42 EDT 2024
Wed Jan 31 06:41:02 EST 2024
IsPeerReviewed false
IsScholarly false
Keywords hybrid programming
load balancing
OpenMP
Charm
adaptive MPI
Language English
LinkModel DirectLink
MeetingName CCGrid '18: 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
MergedId FETCHMERGED-LOGICAL-a278t-c0f2407aa72c1c45f32e9d78ae652425ef9fddc0ab9561209cfdf937c9ec76fc3
PageCount 10
ParticipantIDs acm_books_10_1109_CCGRID_2018_00018_brief
acm_books_10_1109_CCGRID_2018_00018
ieee_primary_8411007
PublicationCentury 2000
PublicationDate 20180501
2018-May
PublicationDateYYYYMMDD 2018-05-01
PublicationDate_xml – month: 05
  year: 2018
  text: 20180501
  day: 01
PublicationDecade 2010
PublicationPlace Piscataway, NJ, USA
PublicationPlace_xml – name: Piscataway, NJ, USA
PublicationSeriesTitle ACM Conferences
PublicationTitle 2018 18th IEEE ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID)
PublicationTitleAbbrev CCGRID
PublicationYear 2018
Publisher IEEE Press
IEEE
Publisher_xml – name: IEEE Press
– name: IEEE
SSID ssj0002684059
SourceID ieee
acm
SourceType Publisher
StartPage 31
SubjectTerms Adaptive MPI
Charm
Hardware
Hybrid Programming
Load Balancing
Load management
Load modeling
Message systems
OpenMP
Programming
Runtime
Task analysis
Title Multi-level load balancing with an integrated runtime approach
URI https://ieeexplore.ieee.org/document/8411007
WOSCitedRecordID wos000494275100004&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D