A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System
Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to write portable and uniform code. However, in some cases, they ma...
| Published in: | SN computer science, Vol. 6, No. 8, p. 911 |
|---|---|
| Main authors: | Mohammadiporshokooh, Karame; Brandt, Steven R.; Kaiser, Hartmut |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Singapore: Springer Nature Singapore; Springer Nature B.V, 01.12.2025 |
| ISSN: | 2661-8907, 2662-995X |
| Online access: | Get full text |
| Abstract | Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies development and lets developers write portable, uniform code. In some cases, however, they may not fully leverage hardware capabilities or allocate resources optimally for specific workloads, which can lead to performance inefficiencies. Our earlier conference paper [Adaptively Optimizing the Performance of HPX's Parallel Algorithms] introduced a preliminary strategy, based on cores and chunking (workload) and integrated into HPX’s executor API, that dynamically optimizes workload distribution and resource allocation based on runtime metrics and overheads. Building on that work, this paper introduces a more detailed model of the strategy and evaluates the efficiency of its implementation (as an HPX executor) across a wide range of compute-bound and memory-bound workloads, on different architectures and with different algorithms. The results show consistent speedups across all tests, configurations, and workloads studied, offering improved performance through a familiar and user-friendly C++ executor API. Additionally, the paper highlights how runtime-driven executor adaptations can simplify performance optimization without increasing the complexity of algorithm development. |
|---|---|
| ArticleNumber | 911 |
| Author | Mohammadiporshokooh, Karame; Brandt, Steven R.; Kaiser, Hartmut |
| Author_xml | 1. Mohammadiporshokooh, Karame (ORCID: 0009-0000-8349-3389; email: kmoham6@lsu.edu), Center of Computation and Technology, Louisiana State University; Department of Computer Science, Louisiana State University. 2. Brandt, Steven R. (ORCID: 0000-0002-7979-2906), Center of Computation and Technology, Louisiana State University; Department of Computer Science, Louisiana State University. 3. Kaiser, Hartmut (ORCID: 0000-0002-8712-2806), Center of Computation and Technology, Louisiana State University; Department of Computer Science, Louisiana State University. |
| ContentType | Journal Article |
| Copyright | The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DOI | 10.1007/s42979-025-04442-y |
| Discipline | Computer Science |
| EISSN | 2661-8907 |
| ISSN | 2661-8907 2662-995X |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 8 |
| Keywords | HPX; Executors; Performance; Asynchronous many-task (AMT); Parallel algorithms; Optimization |
| Language | English |
| ORCID | 0000-0002-7979-2906 0000-0002-8712-2806 0009-0000-8349-3389 |
| OpenAccessLink | https://link.springer.com/10.1007/s42979-025-04442-y |
| PublicationDate | 2025-12-01 |
| PublicationPlace | Singapore |
| PublicationPlace_xml | Singapore; Kolkata |
| PublicationTitle | SN computer science |
| PublicationTitleAbbrev | SN COMPUT. SCI |
| PublicationYear | 2025 |
| Publisher | Springer Nature Singapore; Springer Nature B.V |
| StartPage | 911 |
| SubjectTerms | Algorithms; Application programming interface; C plus plus; C++ (programming language); Computer Imaging; Computer Science; Computer Systems Organization and Communication Networks; Data Structures and Information Theory; Hardware; Information Systems and Communication Service; Libraries; Optimization; Original Research; Pattern Recognition and Graphics; Resource allocation; Run time (computers); Software Engineering/Programming and Operating Systems; Vision; Workload; Workloads |
| Title | A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System |
| URI | https://link.springer.com/article/10.1007/s42979-025-04442-y; https://www.proquest.com/docview/3261988523 |
| Volume | 6 |