Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP

Bibliographic Details
Published in: IEEE Transactions on Parallel and Distributed Systems Vol. 33; no. 12; pp. 4383-4394
Main Authors: Mohammed, Ali, Korndorfer, Jonas H. Muller, Eleliemy, Ahmed, Ciorba, Florina M.
Format: Journal Article
Language: English
Published: New York IEEE 01.12.2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:
ISSN: 1045-9219, 1558-2183
Abstract Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of scheduling kind and chunk parameters on a per-loop, per-application, and per-system basis from a portfolio of advanced scheduling algorithms (Korndörfer et al., 2022). This selection approach is time-consuming, challenging, and may need to change during execution. We propose Auto4OMP, a novel approach for automated load balancing of OpenMP applications. With Auto4OMP, we introduce three scheduling algorithm selection methods and an expert-defined chunk parameter for OpenMP's schedule clause's kind and chunk, respectively. Auto4OMP extends the OpenMP schedule(auto) and chunk parameter implementation in LLVM's OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution. Loop characteristics are inferred in Auto4OMP from the loop execution over the application's time-steps. The experiments performed in this work show that Auto4OMP improves applications performance by up to 11% compared to LLVM's schedule(auto) implementation and outperforms manual selection. Auto4OMP improves MPI+OpenMP applications performance by explicitly minimizing thread- and implicitly reducing process-load imbalance.
Author_xml – sequence: 1
  givenname: Ali
  orcidid: 0000-0002-8465-0398
  surname: Mohammed
  fullname: Mohammed, Ali
  email: ali.mohammed@hpe.com
  organization: HPE's HPC/AI EMEA Research Lab (ERL), Basel, Switzerland
– sequence: 2
  givenname: Jonas H. Muller
  orcidid: 0000-0003-3014-3275
  surname: Korndorfer
  fullname: Korndorfer, Jonas H. Muller
  email: jonas.korndorfer@unibas.ch
  organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
– sequence: 3
  givenname: Ahmed
  orcidid: 0000-0003-3258-1738
  surname: Eleliemy
  fullname: Eleliemy, Ahmed
  email: ahmed.eleliemy@unibas.ch
  organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
– sequence: 4
  givenname: Florina M.
  orcidid: 0000-0002-2773-4499
  surname: Ciorba
  fullname: Ciorba, Florina M.
  email: florina.ciorba@unibas.ch
  organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland
CODEN ITDSEO
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DOI 10.1109/TPDS.2022.3189270
Discipline Engineering
Computer Science
EISSN 1558-2183
EndPage 4394
ExternalDocumentID 10_1109_TPDS_2022_3189270
9825675
Genre orig-research
GrantInformation_xml – fundername: European Union's Horizon 2020 research and innovation programme
  grantid: 957407
– fundername: Swiss Platform for Advanced Scientific Computing
– fundername: DAPHNE
– fundername: Swiss National Science Foundation
  grantid: 169123
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 12
Language English
License https://creativecommons.org/licenses/by/4.0/legalcode
OpenAccessLink https://ieeexplore.ieee.org/document/9825675
PageCount 12
SubjectTerms algorithm selection problem
Algorithms
Automatic selection
Automation
dynamic load balancing
Dynamic scheduling
Heuristic algorithms
Load balancing
Load management
Mathematical analysis
multithreaded programming
OpenMP
Parallel processing
Parameters
Runtime library
Schedules
Scheduling
Scheduling algorithms
self-scheduling
shared-memory systems
Supercomputers
Title Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP
URI https://ieeexplore.ieee.org/document/9825675
https://www.proquest.com/docview/2705852843
Volume 33
WOSCitedRecordID wos000844140200031&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
journalDatabaseRights – providerCode: PRVIEE
  databaseName: IEEE Electronic Library (IEL)
  customDbUrl:
  eissn: 1558-2183
  dateEnd: 99991231
  omitProxy: false
  ssIdentifier: ssj0014504
  issn: 1045-9219
  databaseCode: RIE
  dateStart: 19900101
  isFulltext: true
  titleUrlDefault: https://ieeexplore.ieee.org/
  providerName: IEEE
link http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV07T8MwED4VxAADb0ShIA9MiECcuHE8VgXEAKjiIbFFjh-0oqSoTfn9nF23QgIhsWU4W1G-2Hef73wfwAljiaVCOm7CTIT-1ka5ZGVES2oT7vI02nfXv-X39_nLi-g14GxxF8YY44vPzLl79Ll8PVJTd1R2IZDOYIC7BEucZ7O7WouMAWt7qUBkF-1I4DIMGUwai4un3uUjMsEkQYKai8TpEn_zQV5U5cdO7N3L9cb_XmwT1kMYSToz3LegYapt2JhLNJCwYrdh7Vu_wR146EzrEYaoRqNFH52Mu4tOOsPX0XhQ99_JoxfFQaSIrDTp9qfVG-lJV77lJu3KoQpiX2RQEVeKctfbhefrq6fuTRREFSKVplkd8TzDLSXRSFOMMm0qcyu4FrGktq0znUhm08yUUjPLqMlc7xrljgnTvKSKyzzdg-VqVJl9IFLmDNGNMQJUTMVcmpJRKrU1NLGasybE889cqNBx3AlfDAvPPGJROGQKh0wRkGnC6WLIx6zdxl_GOw6KhWFAoQmtOZZFWJCTAs2RGKEvTg9-H3UIq27uWaVKC5br8dQcwYr6rAeT8bH_174A1lHQKw
linkProvider IEEE
linkToHtml http://cvtisr.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwlV1LT9tAEB7RUKlw4I0IBLoHTlVdvPb6sccoBYEaooikEjdrvQ8SERyUOP39nd1sIqSiStx8GHstf96d-XZm5wO4ZCwylAvLTZgO0N-aIBesDGhJTZTZPI1y3fW7Wa-XPz7y_gZ8X5-F0Vq74jP9w166XL6ayoXdKrviSGcwwP0EmwmOES5Pa61zBixxYoHIL5KA40T0OUwa8qth_-cAuWAUIUXNeWSVid94ISer8s9a7BzMze7HXm0PdnwgSdpL5PdhQ1cHsLsSaSB-zh7A9puOg4fw0F7UUwxStUKLEboZexqdtCdP09m4Hr2QgZPFQayIqBTpjBbVM-kLW8BlH9oRE-nlvsi4IrYY5b5_BL9vroed28DLKgQyjtM6yPIUF5VIIVHRUidU5IZnioeCmkSlKhLMxKkuhWKGUZ3a7jXSbhTGeUllJvL4GBrVtNInQITIGeIbYgwomQwzoUtGqVBG08iojDUhXH3mQvqe41b6YlI47hHywiJTWGQKj0wTvq1veV023Pif8aGFYm3oUWhCa4Vl4afkvEBzpEbojePT9-_6Cl9uh_fdonvX-3UGW3acZd1KCxr1bKHP4bP8U4_nswv33_0F04jTcg
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Automated+Scheduling+Algorithm+Selection+and+Chunk+Parameter+Calculation+in+OpenMP&rft.jtitle=IEEE+transactions+on+parallel+and+distributed+systems&rft.au=Ali%2C+Mohammed&rft.au=Muller+Korndorfer%2C+Jonas+H&rft.au=Eleliemy%2C+Ahmed&rft.au=Ciorba%2C+Florina+M&rft.date=2022-12-01&rft.pub=The+Institute+of+Electrical+and+Electronics+Engineers%2C+Inc.+%28IEEE%29&rft.issn=1045-9219&rft.eissn=1558-2183&rft.volume=33&rft.issue=12&rft.spage=4383&rft_id=info:doi/10.1109%2FTPDS.2022.3189270&rft.externalDBID=NO_FULL_TEXT
thumbnail_l http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=1045-9219&client=summon
thumbnail_m http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=1045-9219&client=summon
thumbnail_s http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1045-9219&client=summon