An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications
| Published in: | IEEE Transactions on Parallel and Distributed Systems, Vol. 20, No. 8, pp. 1158-1172 |
|---|---|
| Main authors: | Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.M.; Catalyurek, U.V.; Kurc, T.; Sadayappan, P.; Saltz, J.H. |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE; The Institute of Electrical and Electronics Engineers, Inc. (IEEE); 1 August 2009 |
| ISSN: | 1045-9219, 1558-2183 |
| Abstract | Complex parallel applications can often be modeled as directed acyclic graphs of coarse-grained application tasks with dependences. These applications exhibit both task and data parallelism, and combining these two (also called mixed parallelism) has been shown to be an effective model for their execution. In this paper, we present an algorithm to compute the appropriate mix of task and data parallelism required to minimize the parallel completion time (makespan) of these applications. In other words, our algorithm determines the set of tasks that should be run concurrently and the number of processors to be allocated to each task. The processor allocation and scheduling decisions are made in an integrated manner and are based on several factors such as the structure of the task graph, the runtime estimates and scalability characteristics of the tasks, and the intertask data communication volumes. A locality-conscious scheduling strategy is used to improve intertask data reuse. Evaluation through simulations and actual executions of task graphs derived from real applications and synthetic graphs shows that our algorithm consistently generates schedules with a lower makespan as compared to Critical Path Reduction (CPR) and Critical Path and Allocation (CPA), two previously proposed scheduling algorithms. Our algorithm also produces schedules that have a lower makespan than pure task- and data-parallel schedules. For task graphs with known optimal schedules or lower bounds on the makespan, our algorithm generates schedules that are closer to the optima than other scheduling approaches. |
|---|---|
| Author | Saltz, J.H.; Sadayappan, P.; Kurc, T.; Krishnamoorthy, S.; Catalyurek, U.V.; Sabin, G.M.; Vydyanathan, N. |
| Author_xml | – sequence: 1 givenname: N. surname: Vydyanathan fullname: Vydyanathan, N. organization: Siemens Corp. Technol., Bangalore – sequence: 2 givenname: S. surname: Krishnamoorthy fullname: Krishnamoorthy, S. – sequence: 3 givenname: G.M. surname: Sabin fullname: Sabin, G.M. – sequence: 4 givenname: U.V. surname: Catalyurek fullname: Catalyurek, U.V. – sequence: 5 givenname: T. surname: Kurc fullname: Kurc, T. – sequence: 6 givenname: P. surname: Sadayappan fullname: Sadayappan, P. – sequence: 7 givenname: J.H. surname: Saltz fullname: Saltz, J.H. |
| CODEN | ITDSEO |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009 |
| Copyright_xml | – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2009 |
| DOI | 10.1109/TPDS.2008.219 |
| Discipline | Engineering Computer Science |
| EISSN | 1558-2183 |
| EndPage | 1172 |
| ExternalDocumentID | 2543463551 10_1109_TPDS_2008_219 4641915 |
| Genre | orig-research |
| ISICitedReferencesCount | 28 |
| ISSN | 1045-9219 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 8 |
| Keywords | Processor Allocation and Scheduling; Mixed-parallelism; Scheduling and task partitioning; Load balancing and task assignment |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html |
| PageCount | 15 |
| PublicationCentury | 2000 |
| PublicationDate | 2009-08-01 |
| PublicationDateYYYYMMDD | 2009-08-01 |
| PublicationDate_xml | – month: 08 year: 2009 text: 2009-08-01 day: 01 |
| PublicationDecade | 2000 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on parallel and distributed systems |
| PublicationTitleAbbrev | TPDS |
| PublicationYear | 2009 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1158 |
| SubjectTerms | Algorithm design and analysis; Algorithms; Allocations; Concurrent computing; Critical path; Data communication; data-flow graphs; Decisions; Graphs; Heuristic; locality-conscious scheduling; Microprocessors; mixed parallelism; Optimal scheduling; Parallel processing; Power system modeling; Processor allocation; Processor scheduling; Runtime; Scalability; Schedules; Scheduling; Scheduling algorithm; Studies; Tasks |
| Title | An Integrated Approach to Locality-Conscious Processor Allocation and Scheduling of Mixed-Parallel Applications |
| URI | https://ieeexplore.ieee.org/document/4641915 https://www.proquest.com/docview/912030380 https://www.proquest.com/docview/875037152 |
| Volume | 20 |
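
The abstract contrasts pure task-parallel, pure data-parallel, and mixed-parallel schedules in terms of makespan. The toy sketch below illustrates that contrast only; it is not the algorithm from the paper. The four-task graph, the Amdahl-style runtime model in `exec_time`, the serial fraction of 0.25, and the simple greedy list scheduler in `makespan` are all assumptions chosen for illustration, and intertask communication costs and locality are ignored.

```python
from typing import Dict, List


def exec_time(seq_time: float, serial_fraction: float, procs: int) -> float:
    """Assumed Amdahl-style runtime of one task on `procs` processors."""
    return seq_time * (serial_fraction + (1.0 - serial_fraction) / procs)


def makespan(order: List[str],
             preds: Dict[str, List[str]],
             seq_time: Dict[str, float],
             serial_fraction: float,
             alloc: Dict[str, int],
             total_procs: int) -> float:
    """Greedy list scheduling of tasks with fixed processor allocations.

    `order` must be a topological order of the task graph.
    """
    free_at = [0.0] * total_procs      # time at which each processor is free
    finish: Dict[str, float] = {}
    for task in order:
        p = alloc[task]
        # A task is ready once all of its predecessors have finished.
        ready = max((finish[q] for q in preds.get(task, [])), default=0.0)
        # Use the p earliest-free processors; all are free by free_at[p - 1].
        free_at.sort()
        start = max(ready, free_at[p - 1])
        end = start + exec_time(seq_time[task], serial_fraction, p)
        free_at[:p] = [end] * p
        finish[task] = end
    return max(finish.values())


# Hypothetical diamond-shaped task graph: A -> {B, C} -> D, on 4 processors.
order = ["A", "B", "C", "D"]
preds = {"B": ["A"], "C": ["A"], "D": ["B", "C"]}
seq_time = {"A": 8.0, "B": 16.0, "C": 16.0, "D": 8.0}
P = 4

task_parallel = makespan(order, preds, seq_time, 0.25, {t: 1 for t in order}, P)
data_parallel = makespan(order, preds, seq_time, 0.25, {t: P for t in order}, P)
mixed = makespan(order, preds, seq_time, 0.25, {"A": 4, "B": 2, "C": 2, "D": 4}, P)

print(task_parallel, data_parallel, mixed)  # 32.0 21.0 17.0
```

In this example the mixed allocation runs the two independent tasks side by side on two processors each, which beats both giving every task one processor and giving every task all four. Choosing such allocations automatically, and accounting for communication volumes and data reuse, is the problem the paper addresses.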