Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG
| Published in: | Proceedings / IEEE International Conference on Cluster Computing, pp. 117-128 |
|---|---|
| Main authors: | Joseph Schuchart, Poornima Nookala, Thomas Herault, Edward F. Valeev, George Bosilca |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 2022-09-01 |
| ISSN: | 2168-9253 |
| Abstract | Shared memory parallel programming models strive to provide low-overhead execution environments. Task-based programming models, in particular, are well-suited to cope with the ubiquitous multi- and many-core systems since they allow applications to express all available concurrency to a scheduler, which is tasked with exploiting the available hardware resources. It is general consensus that atomic operations should be preferred over locks and mutexes to avoid inter-thread serialization and the resulting loss in efficiency. However, even atomic operations may serialize threads if not used judiciously. In this work, we will discuss several optimizations applied to TTG and the underlying PaRSEC runtime system aiming at removing contentious atomic operations to reduce the overhead of task management to a few hundred clock cycles. The result is an optimized data-flow programming system that seamlessly scales from a single node to distributed execution and which is able to compete with OpenMP in shared memory. |
|---|---|
| Author | Schuchart, Joseph; Bosilca, George; Herault, Thomas; Valeev, Edward F.; Nookala, Poornima |
| Author affiliations | 1. Joseph Schuchart (schuchart@icl.utk.edu), The University of Tennessee, Innovative Computing Laboratory, Knoxville, TN, USA; 2. Poornima Nookala, Institute for Advanced Computational Science, Stony Brook University, Stony Brook, NY, USA; 3. Thomas Herault, The University of Tennessee, Innovative Computing Laboratory, Knoxville, TN, USA; 4. Edward F. Valeev, Virginia Polytechnic Institute and State University, Department of Chemistry, Blacksburg, VA, USA; 5. George Bosilca, The University of Tennessee, Innovative Computing Laboratory, Knoxville, TN, USA |
| CODEN | IEEPAD |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CLUSTER51413.2022.00026 |
| Discipline | Computer Science |
| EISBN | 1665498560 9781665498562 |
| EISSN | 2168-9253 |
| EndPage | 128 |
| ExternalDocumentID | 9912704 |
| Genre | orig-research |
| Funding | NSF, grant IDs 1450300, 1450344, 1450262 (funder ID 10.13039/100000001) |
| ISICitedReferencesCount | 5 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| PageCount | 12 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-Sept. |
| PublicationDateYYYYMMDD | 2022-09-01 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / IEEE International Conference on Cluster Computing |
| PublicationTitleAbbrev | CLUSTER |
| PublicationYear | 2022 |
| Publisher | IEEE |
| StartPage | 117 |
| SubjectTerms | Dataflow graph; Hardware; Instruction sets; Memory management; PaRSEC; Parallel programming; Runtime; Scalability; Task analysis; Task-Based Programming; Template Task Graph; TTG |
| Title | Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG |
| URI | https://ieeexplore.ieee.org/document/9912704 |