Multi-level load balancing with an integrated runtime approach

The recent trend of increasing numbers of cores per chip has resulted in vast amounts of on-node parallelism. These high core counts result in hardware variability that introduces imbalance. Applications are also becoming more complex, resulting in dynamic load imbalance. Load imbalance of any kind...

Bibliographic details
Published in:	2018 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGRID), pp. 31-40
Main authors:	Bak, Seonmyeong; Menon, Harshitha; White, Sam; Diener, Matthias; Kale, Laxmikant
Format:	Conference paper
Language:	English
Published:	Piscataway, NJ, USA: IEEE Press; IEEE, 1 May 2018
Series:	ACM Conferences
Subjects:	Adaptive MPI; Charm; Hardware; Hybrid Programming; Load Balancing; Load management; Load modeling; Message systems; OpenMP; Programming; Runtime; Task analysis
ISBN:	1538658151, 9781538658154

Abstract	The recent trend of increasing numbers of cores per chip has resulted in vast amounts of on-node parallelism. These high core counts result in hardware variability that introduces imbalance. Applications are also becoming more complex, resulting in dynamic load imbalance. Load imbalance of any kind can result in loss of performance and system utilization. We address the challenge of handling both transient and persistent load imbalances while maintaining locality with low overhead. In this paper, we propose an integrated runtime system that combines the Charm++ distributed programming model with concurrent tasks to mitigate load imbalances within and across shared memory address spaces. It utilizes a periodic assignment of work to cores based on load measurement, in combination with user-created tasks, to handle load imbalance. We integrate OpenMP with Charm++ to enable creation of potential tasks via OpenMP's parallel loop construct. This is also available to MPI applications through the Adaptive MPI implementation. We demonstrate the benefits of our work on three applications. We show improvements of Lassen by 29.6% on Cori and 46.5% on Theta. We also demonstrate the benefits on a Charm++ application, ChaNGa, by 25.7% on Theta, as well as on an MPI proxy application, Kripke, using Adaptive MPI.
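The "periodic assignment of work to cores based on load measurement" mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's runtime or any Charm++ API: the function names (`greedy_assign`, `core_loads`), the sample loads, and the choice of a greedy heaviest-first strategy are all assumptions made purely for illustration.

```python
import heapq


def greedy_assign(loads, num_cores):
    """Hypothetical helper: place measured work units on cores.

    Units are taken heaviest first and each one goes to the core with
    the smallest accumulated load so far (a classic greedy heuristic).
    """
    # Min-heap of (accumulated load, core id); ties break on core id.
    heap = [(0.0, core) for core in range(num_cores)]
    heapq.heapify(heap)
    assignment = {}
    for unit, load in sorted(loads.items(), key=lambda kv: kv[1], reverse=True):
        core_load, core = heapq.heappop(heap)
        assignment[unit] = core
        heapq.heappush(heap, (core_load + load, core))
    return assignment


def core_loads(loads, assignment, num_cores):
    """Sum the measured load that ends up on each core."""
    totals = [0.0] * num_cores
    for unit, core in assignment.items():
        totals[core] += loads[unit]
    return totals


# Illustrative measurements (made-up numbers, arbitrary time units).
measured = {"a": 8.0, "b": 7.0, "c": 6.0, "d": 5.0, "e": 4.0}
placement = greedy_assign(measured, 2)
totals = core_loads(measured, placement, 2)
```

In a real runtime such a step would be repeated periodically with fresh measurements; a greedy heuristic like this is cheap but not guaranteed to find the most even split, and it ignores locality, which the abstract names as a separate concern.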
Author_xml	1. Bak, Seonmyeong (sbak5@illinois.edu), University of Illinois at Urbana-Champaign; 2. Menon, Harshitha (harshitha@llnl.gov), Lawrence Livermore National Laboratory; 3. White, Sam (white67@illinois.edu), University of Illinois at Urbana-Champaign; 4. Diener, Matthias (mdiener@illinois.edu), University of Illinois at Urbana-Champaign; 5. Kale, Laxmikant (kale@illinois.edu), University of Illinois at Urbana-Champaign
CODEN	IEEPAD
ContentType	Conference Proceeding
DOI	10.1109/CCGRID.2018.00018
EISBN	1538658151 9781538658154
EndPage	40
ExternalDocumentID	8411007
Genre	orig-research
ISICitedReferencesCount	13
IsPeerReviewed	false
IsScholarly	false
MeetingName	CCGrid '18: 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing
PageCount	10
PublicationDate	2018-05-01
PublicationTitleAbbrev	CCGRID
StartPage	31
URI	https://ieeexplore.ieee.org/document/8411007