From reactive to proactive load balancing for task‐based parallel applications in distributed memory machines
Load balancing is often a challenge in task‐parallel applications. The balancing problems are divided into static and dynamic. “Static” means that we have some prior knowledge about load information and perform balancing before execution, while “dynamic” must rely on partial information of the execu...
| Published in: | Concurrency and computation Vol. 35; no. 24 |
|---|---|
| Main Authors: | Thanh Chung, Minh; Weidendorfer, Josef; Fürlinger, Karl; Kranzlmüller, Dieter |
| Format: | Journal Article |
| Language: | English |
| Published: | Hoboken: Wiley Subscription Services, Inc, 01.11.2023 |
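| Subjects: | Decisions; Distributed memory; Load balancing; Prediction models; Run time (computers); Task scheduling |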
| ISSN: | 1532-0626, 1532-0634 |
| Abstract | Load balancing is often a challenge in task‐parallel applications. The balancing problems are divided into static and dynamic. “Static” means that we have some prior knowledge about load information and perform balancing before execution, while “dynamic” must rely on partial information of the execution status to balance the load at runtime. Conventionally, work stealing is a practical approach used in almost all shared memory systems. In distributed memory systems, the communication overhead can make stealing tasks too late. To improve, people have proposed a reactive approach to relax communication in balancing load. The approach leaves one dedicated thread per process to monitor the queue status and offload tasks reactively from a slow to a fast process. However, reactive decisions might be mistaken in high imbalance cases. First, this article proposes a performance model to analyze reactive balancing behaviors and understand the bound leading to incorrect decisions. Second, we introduce a proactive approach to improve further balancing tasks at runtime. The approach exploits task‐based programming models with a dedicated thread as well, namely Tcomm. Nevertheless, the main idea is to force Tcomm not only to monitor load; it will characterize tasks and train load prediction models by online learning. “Proactive” indicates offloading tasks before each execution phase proactively with an appropriate number of tasks at once to a potential victim (denoted by an underloaded/fast process). The experimental results confirm speedup improvements from 1.5× to 3.4× in important use cases compared to the previous solutions. Furthermore, this approach can support co‐scheduling tasks across multiple applications. |
|---|---|
| AbstractList | Load balancing is often a challenge in task‐parallel applications. The balancing problems are divided into static and dynamic. “Static” means that we have some prior knowledge about load information and perform balancing before execution, while “dynamic” must rely on partial information of the execution status to balance the load at runtime. Conventionally, work stealing is a practical approach used in almost all shared memory systems. In distributed memory systems, the communication overhead can make stealing tasks too late. To improve, people have proposed a reactive approach to relax communication in balancing load. The approach leaves one dedicated thread per process to monitor the queue status and offload tasks reactively from a slow to a fast process. However, reactive decisions might be mistaken in high imbalance cases. First, this article proposes a performance model to analyze reactive balancing behaviors and understand the bound leading to incorrect decisions. Second, we introduce a proactive approach to improve further balancing tasks at runtime. The approach exploits task‐based programming models with a dedicated thread as well, namely Tcomm. Nevertheless, the main idea is to force Tcomm not only to monitor load; it will characterize tasks and train load prediction models by online learning. “Proactive” indicates offloading tasks before each execution phase proactively with an appropriate number of tasks at once to a potential victim (denoted by an underloaded/fast process). The experimental results confirm speedup improvements from 1.5× to 3.4× in important use cases compared to the previous solutions. Furthermore, this approach can support co‐scheduling tasks across multiple applications. |
| Author | Kranzlmüller, Dieter; Fürlinger, Karl; Weidendorfer, Josef; Thanh Chung, Minh |
| Author_xml | – sequence: 1 givenname: Minh orcidid: 0000-0001-6119-3852 surname: Thanh Chung fullname: Thanh Chung, Minh organization: MNM‐Team Ludwig‐Maximilians‐Universität München Munich Germany – sequence: 2 givenname: Josef orcidid: 0000-0001-7159-1432 surname: Weidendorfer fullname: Weidendorfer, Josef organization: Leibniz Supercomputing Centre (LRZ) Garching Germany – sequence: 3 givenname: Karl orcidid: 0000-0003-0398-4087 surname: Fürlinger fullname: Fürlinger, Karl organization: MNM‐Team Ludwig‐Maximilians‐Universität München Munich Germany – sequence: 4 givenname: Dieter orcidid: 0000-0002-8319-0123 surname: Kranzlmüller fullname: Kranzlmüller, Dieter organization: MNM‐Team Ludwig‐Maximilians‐Universität München Munich Germany, Leibniz Supercomputing Centre (LRZ) Garching Germany |
| ContentType | Journal Article |
| Copyright | 2023. This article is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. |
| DBID | AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.1002/cpe.7828 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Computer and Information Systems Abstracts Technology Research Database Computer and Information Systems Abstracts – Academic Advanced Technologies Database with Aerospace ProQuest Computer Science Collection Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts CrossRef |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1532-0634 |
| ExternalDocumentID | 10_1002_cpe_7828 |
| ID | FETCH-LOGICAL-c250t-314f64dd8684461a1bab9b10e5483a54a7c722f33cba53d6b4014a9321fffcb63 |
| ISICitedReferencesCount | 1 |
| ISSN | 1532-0626 |
| IngestDate | Sun Nov 09 08:24:58 EST 2025 Sat Nov 29 03:49:54 EST 2025 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 24 |
| Language | English |
| LinkModel | OpenURL |
| Notes | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ORCID | 0000-0003-0398-4087 0000-0001-6119-3852 0000-0002-8319-0123 0000-0001-7159-1432 |
| OpenAccessLink | https://onlinelibrary.wiley.com/doi/pdfdirect/10.1002/cpe.7828 |
| PQID | 2877617516 |
| PQPubID | 2045170 |
| ParticipantIDs | proquest_journals_2877617516 crossref_primary_10_1002_cpe_7828 |
| PublicationCentury | 2000 |
| PublicationDate | 2023-11-00 20231101 |
| PublicationDateYYYYMMDD | 2023-11-01 |
| PublicationDate_xml | – month: 11 year: 2023 text: 2023-11-00 |
| PublicationDecade | 2020 |
| PublicationPlace | Hoboken |
| PublicationPlace_xml | – name: Hoboken |
| PublicationTitle | Concurrency and computation |
| PublicationYear | 2023 |
| Publisher | Wiley Subscription Services, Inc |
| Publisher_xml | – name: Wiley Subscription Services, Inc |
| SSID | ssj0011031 |
| Score | 2.3705132 |
| Snippet | Load balancing is often a challenge in task‐parallel applications. The balancing problems are divided into static and dynamic. “Static” means that we have some... |
| SourceID | proquest crossref |
| SourceType | Aggregation Database Index Database |
| SubjectTerms | Decisions; Distributed memory; Load balancing; Prediction models; Run time (computers); Task scheduling |
| Title | From reactive to proactive load balancing for task‐based parallel applications in distributed memory machines |
| URI | https://www.proquest.com/docview/2877617516 |
| Volume | 35 |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVWIB databaseName: Wiley Online Library Full Collection 2020 customDbUrl: eissn: 1532-0634 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0011031 issn: 1532-0626 databaseCode: DRFUL dateStart: 20010101 isFulltext: true titleUrlDefault: https://onlinelibrary.wiley.com providerName: Wiley-Blackwell |
| linkProvider | Wiley-Blackwell |