Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP
Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of scheduling kind and chunk parameters on a per-loop, per-application, and per-system basis fr...
| Published in: | IEEE Transactions on Parallel and Distributed Systems, Vol. 33, no. 12, pp. 4383-4394 |
|---|---|
| Main Authors: | Mohammed, Ali; Korndorfer, Jonas H. Muller; Eleliemy, Ahmed; Ciorba, Florina M. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01.12.2022 |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | Increasing node and cores-per-node counts in supercomputers render scheduling and load balancing critical for exploiting parallelism. OpenMP applications can achieve high performance via careful selection of scheduling kind and chunk parameters on a per-loop, per-application, and per-system basis from a portfolio of advanced scheduling algorithms (Korndörfer et al., 2022). This selection approach is time-consuming, challenging, and may need to change during execution. We propose Auto4OMP, a novel approach for automated load balancing of OpenMP applications. With Auto4OMP, we introduce three scheduling algorithm selection methods and an expert-defined chunk parameter for the `kind` and `chunk` arguments of OpenMP's `schedule` clause, respectively. Auto4OMP extends the OpenMP `schedule(auto)` and chunk parameter implementation in LLVM's OpenMP runtime library to automatically select a scheduling algorithm and calculate a chunk parameter during execution. Auto4OMP infers loop characteristics from the loop's execution over the application's time-steps. The experiments performed in this work show that Auto4OMP improves application performance by up to 11% compared to LLVM's `schedule(auto)` implementation and outperforms manual selection. Auto4OMP improves the performance of MPI+OpenMP applications by explicitly minimizing thread-load imbalance and implicitly reducing process-load imbalance. |
|---|---|
| Author | Eleliemy, Ahmed; Ciorba, Florina M.; Mohammed, Ali; Korndorfer, Jonas H. Muller |
| Author_xml | – sequence: 1 givenname: Ali orcidid: 0000-0002-8465-0398 surname: Mohammed fullname: Mohammed, Ali email: ali.mohammed@hpe.com organization: HPE's HPC/AI EMEA Research Lab (ERL), Basel, Switzerland – sequence: 2 givenname: Jonas H. Muller orcidid: 0000-0003-3014-3275 surname: Korndorfer fullname: Korndorfer, Jonas H. Muller email: jonas.korndorfer@unibas.ch organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland – sequence: 3 givenname: Ahmed orcidid: 0000-0003-3258-1738 surname: Eleliemy fullname: Eleliemy, Ahmed email: ahmed.eleliemy@unibas.ch organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland – sequence: 4 givenname: Florina M. orcidid: 0000-0002-2773-4499 surname: Ciorba fullname: Ciorba, Florina M. email: florina.ciorba@unibas.ch organization: Department of Mathematics and Computer Science, University of Basel, Basel, Switzerland |
| CODEN | ITDSEO |
| CitedBy_id | crossref_primary_10_1016_j_future_2024_07_005 crossref_primary_10_1016_j_cageo_2025_105932 crossref_primary_10_1109_ACCESS_2025_3602234 |
| Cites_doi | 10.1002/cpe.5170 10.1007/978-3-030-28596-8_4 10.1007/978-3-319-98521-3_2 10.1002/cpe.5648 10.1108/02652320610642317 10.1109/99.660313 10.1145/2934661 10.1109/TSE.1985.231547 10.1051/0004-6361/201630208 10.1016/j.jpdc.2014.06.008 10.1109/ISPDC.2019.00026 10.1023/A:1023588520138 10.1137/1.9781611976137.7 10.1109/TPDS.2021.3107775 10.1109/IPDPSW.2014.183 10.1145/135226.135232 10.1109/91.811231 10.1007/978-3-540-74466-5_17 10.1007/978-3-642-11970-5_16 10.1007/978-3-642-30961-8_7 10.1142/9789814261302_0021 10.1111/j.1749-6632.1980.tb29690.x 10.1109/TSMC.1973.5408575 10.1109/IPDPS.2005.386 10.1145/143095.143134 10.1145/237502.237576 10.1109/ISPDC.2017.23 10.1109/ISPDC.2017.9 10.1016/S0065-2458(08)60520-3 10.1006/jpdc.1997.1411 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| DBID | 97E ESBDL RIA RIE AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| DOI | 10.1109/TPDS.2022.3189270 |
| DatabaseName | IEEE Xplore (IEEE) IEEE Xplore Open Access Journals IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering Computer Science |
| EISSN | 1558-2183 |
| EndPage | 4394 |
| ExternalDocumentID | 10_1109_TPDS_2022_3189270 9825675 |
| Genre | orig-research |
| GrantInformation_xml | – fundername: European Union's Horizon 2020 research and innovation programme grantid: 957407 – fundername: Swiss Platform for Advanced Scientific Computing – fundername: DAPHNE – fundername: Swiss National Science Foundation grantid: 169123 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 4 |
| ISSN | 1045-9219 |
| IngestDate | Mon Jun 30 06:30:50 EDT 2025 Sat Nov 29 06:06:49 EST 2025 Tue Nov 18 21:24:08 EST 2025 Wed Aug 27 02:28:35 EDT 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 12 |
| Language | English |
| License | https://creativecommons.org/licenses/by/4.0/legalcode |
| LinkModel | DirectLink |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ORCID | 0000-0003-3014-3275 0000-0002-8465-0398 0000-0002-2773-4499 0000-0003-3258-1738 |
| OpenAccessLink | https://ieeexplore.ieee.org/document/9825675 |
| PQID | 2705852843 |
| PQPubID | 85437 |
| PageCount | 12 |
| ParticipantIDs | crossref_primary_10_1109_TPDS_2022_3189270 ieee_primary_9825675 crossref_citationtrail_10_1109_TPDS_2022_3189270 proquest_journals_2705852843 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-12-01 |
| PublicationDateYYYYMMDD | 2022-12-01 |
| PublicationDate_xml | – month: 12 year: 2022 text: 2022-12-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2022 |
| Publisher | IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| References | ref35 mohammed (ref29) 2018 banicescu (ref31) 2013 ref34 ref36 ref30 ref33 ref11 board (ref13) 2021 ref32 ref1 ref39 ref17 (ref2) 2022 abraham (ref16) 2015; 1 ref19 bergman (ref3) 2008; 15 booth (ref45) 2020 ref24 müller (ref18) 2012 ref23 ref25 (ref14) 2021 ref20 ref42 ref41 ref22 ref44 ref21 cabezón (ref37) 2019 ref28 board (ref12) 2021 ref27 rice (ref10) 1976 banicescu (ref26) 2000 livio (ref38) 2008 ref8 ref7 ref9 ref4 klir (ref43) 1995 ref6 ref5 ref40 vazquez (ref15) 2014 |
| References_xml | – ident: ref8 doi: 10.1002/cpe.5170 – start-page: 456 year: 2018 ident: ref29 article-title: SiL: An approach for adjusting applications to heterogeneous systems under perturbations publication-title: Proc Int Workshop Algorithms Models Tools Parallel Comput Heterogeneous Platforms 24th Int Eur Conf Parallel Distrib Comput – ident: ref28 doi: 10.1007/978-3-030-28596-8_4 – year: 2019 ident: ref37 article-title: SPHYNX website – ident: ref7 doi: 10.1007/978-3-319-98521-3_2 – ident: ref30 doi: 10.1002/cpe.5648 – year: 2008 ident: ref38 publication-title: The Golden Ratio The Story of Phi The World's Most Astonishing Number – year: 2021 ident: ref12 article-title: OpenMP application programming interface standard v.3.0 – ident: ref11 doi: 10.1108/02652320610642317 – ident: ref6 doi: 10.1109/99.660313 – ident: ref4 doi: 10.1145/2934661 – ident: ref20 doi: 10.1109/TSE.1985.231547 – ident: ref19 doi: 10.1051/0004-6361/201630208 – ident: ref33 doi: 10.1016/j.jpdc.2014.06.008 – ident: ref9 doi: 10.1109/ISPDC.2019.00026 – ident: ref25 doi: 10.1023/A:1023588520138 – ident: ref5 doi: 10.1137/1.9781611976137.7 – ident: ref1 doi: 10.1109/TPDS.2021.3107775 – ident: ref34 doi: 10.1109/IPDPSW.2014.183 – year: 1995 ident: ref43 publication-title: Fuzzy Sets and Fuzzy Logic – ident: ref21 doi: 10.1145/135226.135232 – year: 2021 ident: ref14 – start-page: 437 year: 2013 ident: ref31 publication-title: Scalable Computing and Communications Theory and Practice – volume: 1 start-page: 19 year: 2015 ident: ref16 article-title: GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers publication-title: Software-X – ident: ref42 doi: 10.1109/91.811231 – ident: ref39 doi: 10.1007/978-3-540-74466-5_17 – ident: ref36 doi: 10.1007/978-3-642-11970-5_16 – ident: ref35 doi: 10.1007/978-3-642-30961-8_7 – ident: ref40 doi: 10.1142/9789814261302_0021 – ident: ref17 doi: 10.1111/j.1749-6632.1980.tb29690.x – ident: ref41 doi: 
10.1109/TSMC.1973.5408575 – ident: ref27 doi: 10.1109/IPDPS.2005.386 – start-page: 122 year: 2000 ident: ref26 article-title: Adaptive factoring: A dynamic scheduling method tuned to the rate of weight changes publication-title: Proc Symp High-Perform Comput Arch – ident: ref22 doi: 10.1145/143095.143134 – ident: ref23 doi: 10.1145/237502.237576 – year: 2021 ident: ref13 article-title: OpenMP application programming interface standard v.5.0 – volume: 15 year: 2008 ident: ref3 article-title: Exascale computing study: Technology challenges in achieving exascale systems publication-title: Defense Adv Res Projects Agency Informat Process Techn Office – year: 2020 ident: ref45 article-title: An adaptive self-scheduling loop scheduler – ident: ref44 doi: 10.1109/ISPDC.2017.23 – year: 2022 ident: ref2 article-title: Top500 list – ident: ref32 doi: 10.1109/ISPDC.2017.9 – start-page: 65 year: 1976 ident: ref10 article-title: The algorithm selection problem publication-title: Advances in Computers doi: 10.1016/S0065-2458(08)60520-3 – year: 2014 ident: ref15 article-title: Alya: Towards exascale for engineering simulation codes – start-page: 223 year: 2012 ident: ref18 article-title: SPEC OMP2012 publication-title: Proc Int Workshop OpenMP – ident: ref24 doi: 10.1006/jpdc.1997.1411 |
| SSID | ssj0014504 |
| Score | 2.398448 |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 4383 |
| SubjectTerms | algorithm selection problem; Algorithms; Automatic selection; Automation; dynamic load balancing; Dynamic scheduling; Heuristic algorithms; Load balancing; Load management; Mathematical analysis; multithreaded programming; OpenMP; Parallel processing; Parameters; Runtime library; Schedules; Scheduling; Scheduling algorithms; self-scheduling; shared-memory systems; Supercomputers |
| Title | Automated Scheduling Algorithm Selection and Chunk Parameter Calculation in OpenMP |
| URI | https://ieeexplore.ieee.org/document/9825675 https://www.proquest.com/docview/2705852843 |
| Volume | 33 |
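
The abstract's claim that both the scheduling kind and the chunk parameter affect load balance can be illustrated with a small toy model. The sketch below is not Auto4OMP and not LLVM's runtime; it is a simplified Python simulation under stated assumptions: iteration costs are known in advance (the `costs` list is hypothetical), scheduling overhead is ignored, and `schedule(dynamic, chunk)` is idealized as "the next chunk goes to the thread that becomes idle first". The function names are illustrative only.

```python
def static_schedule(costs, n_threads, chunk):
    """Per-thread busy time when chunks are dealt round-robin,
    as schedule(static, chunk) does."""
    load = [0.0] * n_threads
    for c, start in enumerate(range(0, len(costs), chunk)):
        load[c % n_threads] += sum(costs[start:start + chunk])
    return load


def dynamic_schedule(costs, n_threads, chunk):
    """Per-thread busy time in an idealized schedule(dynamic, chunk) model:
    each chunk is taken by the thread that is idle earliest
    (assumption: zero scheduling overhead)."""
    load = [0.0] * n_threads
    for start in range(0, len(costs), chunk):
        t = load.index(min(load))  # thread that finishes its work first
        load[t] += sum(costs[start:start + chunk])
    return load


def imbalance(load):
    """Ratio of the slowest thread's time to the mean time; 1.0 is perfect."""
    return max(load) / (sum(load) / len(load))


if __name__ == "__main__":
    # Hypothetical loop: 16 iterations whose cost decreases from 16 to 1.
    costs = [float(c) for c in range(16, 0, -1)]
    for name, fn, chunk in [("static", static_schedule, 4),
                            ("static", static_schedule, 1),
                            ("dynamic", dynamic_schedule, 4),
                            ("dynamic", dynamic_schedule, 1)]:
        load = fn(costs, 4, chunk)
        print(f"schedule({name}, {chunk}): load={load} "
              f"imbalance={imbalance(load):.2f}")
```

For this particular cost profile on four threads, the model gives an imbalance of about 1.71 for a chunk of 4 under either kind, about 1.18 for `static` with a chunk of 1, and 1.0 for `dynamic` with a chunk of 1. Different cost profiles favor different combinations, which is the selection problem the abstract describes.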