Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG

Shared memory parallel programming models strive to provide low-overhead execution environments. Task-based programming models, in particular, are well-suited to cope with the ubiquitous multi- and many-core systems since they allow applications to express all available concurrency to a scheduler, w...

Detailed bibliography
Published in: Proceedings / IEEE International Conference on Cluster Computing, pp. 117-128
Main authors: Schuchart, Joseph; Nookala, Poornima; Herault, Thomas; Valeev, Edward F.; Bosilca, George
Format: Conference paper
Language: English
Publication details: IEEE, 1 September 2022
ISSN: 2168-9253
Abstract Shared memory parallel programming models strive to provide low-overhead execution environments. Task-based programming models, in particular, are well-suited to cope with the ubiquitous multi- and many-core systems since they allow applications to express all available concurrency to a scheduler, which is tasked with exploiting the available hardware resources. It is general consensus that atomic operations should be preferred over locks and mutexes to avoid inter-thread serialization and the resulting loss in efficiency. However, even atomic operations may serialize threads if not used judiciously. In this work, we will discuss several optimizations applied to TTG and the underlying PaRSEC runtime system aiming at removing contentious atomic operations to reduce the overhead of task management to a few hundred clock cycles. The result is an optimized data-flow programming system that seamlessly scales from a single node to distributed execution and which is able to compete with OpenMP in shared memory.
Author Schuchart, Joseph
Bosilca, George
Herault, Thomas
Valeev, Edward F.
Nookala, Poornima
Author_xml – sequence: 1
  givenname: Joseph
  surname: Schuchart
  fullname: Schuchart, Joseph
  email: schuchart@icl.utk.edu
  organization: The University of Tennessee,Innovative Computing Laboratory,Knoxville,TN,USA
– sequence: 2
  givenname: Poornima
  surname: Nookala
  fullname: Nookala, Poornima
  organization: Institute for Advanced Computational Science,Stony Brook University,Stony Brook,NY,USA
– sequence: 3
  givenname: Thomas
  surname: Herault
  fullname: Herault, Thomas
  organization: The University of Tennessee,Innovative Computing Laboratory,Knoxville,TN,USA
– sequence: 4
  givenname: Edward F.
  surname: Valeev
  fullname: Valeev, Edward F.
  organization: Virginia Polytechnic Institute and State University,Department of Chemistry,Blacksburg,VA,USA
– sequence: 5
  givenname: George
  surname: Bosilca
  fullname: Bosilca, George
  organization: The University of Tennessee,Innovative Computing Laboratory,Knoxville,TN,USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/CLUSTER51413.2022.00026
Discipline Computer Science
EISBN 1665498560
9781665498562
EISSN 2168-9253
EndPage 128
ExternalDocumentID 9912704
Genre orig-research
GrantInformation_xml – fundername: NSF
  grantid: 1450300,1450344,1450262
  funderid: 10.13039/100000001
ISICitedReferencesCount 5
IsPeerReviewed false
IsScholarly true
Language English
PageCount 12
PublicationCentury 2000
PublicationDate 2022-Sept.
PublicationDateYYYYMMDD 2022-09-01
PublicationDate_xml – month: 09
  year: 2022
  text: 2022-Sept.
PublicationDecade 2020
PublicationTitle Proceedings / IEEE International Conference on Cluster Computing
PublicationTitleAbbrev CLUSTER
PublicationYear 2022
Publisher IEEE
StartPage 117
SubjectTerms Dataflow graph
Hardware
Instruction sets
Memory management
PaRSEC
Parallel programming
Runtime
Scalability
Task analysis
Task-Based Programming
Template Task Graph
TTG
Title Pushing the Boundaries of Small Tasks: Scalable Low-Overhead Data-Flow Programming in TTG
URI https://ieeexplore.ieee.org/document/9912704