A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System

Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies the development process, allowing developers to write portable and uniform code. However, in some cases, they ma...

Detailed bibliography
Published in:	SN Computer Science, Volume 6, Issue 8, p. 911
Main authors:	Mohammadiporshokooh, Karame; Brandt, Steven R.; Kaiser, Hartmut
Format:	Journal Article
Language:	English
Publication details:	Singapore: Springer Nature Singapore, 1 December 2025; Springer Nature B.V
Subjects:	Algorithms; Application programming interface; C plus plus; C++ (programming language); Computer Imaging; Computer Science; Computer Systems Organization and Communication Networks; Data Structures and Information Theory; Hardware; Information Systems and Communication Service; Libraries; Optimization; Original Research; Pattern Recognition and Graphics; Resource allocation; Run time (computers); Software Engineering/Programming and Operating Systems; Vision; Workload; Workloads; HPX; Executors; Performance; Asynchronous many-task (AMT); Parallel algorithms
ISSN:	2661-8907, 2662-995X

Abstract	Developing parallel algorithms efficiently requires careful management of concurrency across diverse hardware architectures. C++ executors provide a standardized interface that simplifies development and lets developers write portable, uniform code. In some cases, however, they may not fully leverage hardware capabilities or allocate resources optimally for specific workloads, which can lead to performance inefficiencies. Our earlier conference paper [Adaptively Optimizing the Performance of HPX's Parallel Algorithms] introduced a preliminary strategy, based on cores and chunking (workload), that dynamically optimizes workload distribution and resource allocation using runtime metrics and overheads, and integrated it into HPX's executor API. This paper introduces a more detailed model of that strategy and evaluates the efficiency of its implementation (as an HPX executor) across a wide range of compute-bound and memory-bound workloads, on different architectures and with different algorithms. The results show consistent speedups across all tests, configurations, and workloads studied, offering improved performance through a familiar and user-friendly C++ executor API. Additionally, the paper highlights how runtime-driven executor adaptations can simplify performance optimization without increasing the complexity of algorithm development.
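To make the idea of choosing cores and chunk size from runtime measurements concrete, here is a minimal sketch in standard C++. It is not the model from the paper and does not use the HPX API: the names `plan`, `ceil_div`, and `choose_plan`, the `min_ratio` threshold, and the `chunks_per_core` target are all hypothetical, chosen only to illustrate how a measured per-iteration time and a measured per-task overhead could drive the two decisions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical result type: cores to use and iterations per task (chunk).
struct plan {
    std::size_t cores;
    std::size_t chunk_size;
};

// Integer ceiling division; b must be non-zero.
inline std::size_t ceil_div(std::size_t a, std::size_t b) {
    return (a + b - 1) / b;
}

// Illustrative heuristic (an assumption, NOT the model from the paper).
//   n               number of loop iterations
//   t_iter          measured time per iteration (any unit)
//   t_overhead      measured cost of scheduling one task (same unit)
//   max_cores       cores available to the executor
//   min_ratio       a chunk must carry at least min_ratio * t_overhead of work
//   chunks_per_core target number of chunks per core for load balancing
inline plan choose_plan(std::size_t n, double t_iter, double t_overhead,
                        std::size_t max_cores, double min_ratio = 10.0,
                        std::size_t chunks_per_core = 4) {
    assert(max_cores > 0 && chunks_per_core > 0);
    if (n == 0) return plan{1, 1};  // nothing to split

    // Smallest chunk that still amortizes the per-task overhead.
    std::size_t min_chunk = 1;
    if (t_iter > 0.0 && t_overhead > 0.0) {
        double c = std::ceil(min_ratio * t_overhead / t_iter);
        if (c >= static_cast<double>(n)) {
            min_chunk = n;
        } else if (c > 1.0) {
            min_chunk = static_cast<std::size_t>(c);
        }
    }

    // Chunk size that would give every core `chunks_per_core` chunks.
    std::size_t balanced = ceil_div(n, max_cores * chunks_per_core);

    // Never go below the overhead-amortizing size, never above n.
    std::size_t chunk = std::min(n, std::max(min_chunk, balanced));

    // Use only as many cores as there are chunks to run.
    std::size_t cores = std::min(max_cores, ceil_div(n, chunk));
    return plan{cores, chunk};
}
```

Under these assumptions, a large loop uses every available core with balanced chunks, while a loop too small to amortize the scheduling overhead collapses to fewer cores or to a single task.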
ArticleNumber	911
Author	Mohammadiporshokooh, Karame; Brandt, Steven R.; Kaiser, Hartmut
Author_xml	– sequence: 1 givenname: Karame orcidid: 0009-0000-8349-3389 surname: Mohammadiporshokooh fullname: Mohammadiporshokooh, Karame email: kmoham6@lsu.edu organization: Center of Computation and Technology, Louisiana State University, Department of Computer Science, Louisiana State University – sequence: 2 givenname: Steven R. orcidid: 0000-0002-7979-2906 surname: Brandt fullname: Brandt, Steven R. organization: Center of Computation and Technology, Louisiana State University, Department of Computer Science, Louisiana State University – sequence: 3 givenname: Hartmut orcidid: 0000-0002-8712-2806 surname: Kaiser fullname: Kaiser, Hartmut organization: Center of Computation and Technology, Louisiana State University, Department of Computer Science, Louisiana State University
ContentType	Journal Article
Copyright	The Author(s) 2025. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI	10.1007/s42979-025-04442-y
Discipline	Computer Science
EISSN	2661-8907
ISSN	2661-8907 2662-995X
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	8
Keywords	HPX; Executors; Performance; Asynchronous many-task (AMT); Parallel algorithms; Optimization
Language	English
ORCID	0000-0002-7979-2906 0000-0002-8712-2806 0009-0000-8349-3389
OpenAccessLink	https://link.springer.com/10.1007/s42979-025-04442-y
PublicationCentury	2000
PublicationDate	2025-12-01
PublicationDateYYYYMMDD	2025-12-01
PublicationDate_xml	– month: 12 year: 2025 text: 2025-12-01 day: 01
PublicationDecade	2020
PublicationPlace	Singapore
PublicationPlace_xml	– name: Singapore – name: Kolkata
PublicationTitle	SN computer science
PublicationTitleAbbrev	SN COMPUT. SCI
PublicationYear	2025
Publisher	Springer Nature Singapore Springer Nature B.V
Publisher_xml	– name: Springer Nature Singapore – name: Springer Nature B.V
StartPage	911
SubjectTerms	Algorithms; Application programming interface; C plus plus; C++ (programming language); Computer Imaging; Computer Science; Computer Systems Organization and Communication Networks; Data Structures and Information Theory; Hardware; Information Systems and Communication Service; Libraries; Optimization; Original Research; Pattern Recognition and Graphics; Resource allocation; Run time (computers); Software Engineering/Programming and Operating Systems; Vision; Workload; Workloads
Title	A New Execution Model and Executor for Adaptively Optimizing the Performance of Parallel Algorithms Using HPX Runtime System
URI	https://link.springer.com/article/10.1007/s42979-025-04442-y https://www.proquest.com/docview/3261988523
Volume	6