Productive Programming of GPU Clusters with OmpSs


Detailed bibliography
Published in: 2012 IEEE 26th International Parallel and Distributed Processing Symposium, pp. 557-568
Main authors: Bueno, J., Planas, J., Duran, A., Badia, R. M., Martorell, X., Ayguade, E., Labarta, J.
Format: Conference paper, Publication
Language: English
Published: IEEE, 1 May 2012
ISBN: 1467309753, 9781467309752
ISSN: 1530-2075
Abstract: Clusters of GPUs are emerging as a new computational scenario. Programming them requires hybrid models that increase application complexity and reduce programmer productivity. We present the implementation of OmpSs for clusters of GPUs, which supports asynchrony and heterogeneity for task parallelism. It is based on annotating a serial application with directives that are translated by the compiler. With it, the same program that runs sequentially on a node with a single GPU can run in parallel on multiple GPUs, either local (single node) or remote (cluster of GPUs). Besides performing a task-based parallelization, the runtime system moves data as needed between the different nodes and GPUs, minimizing the impact of communication through affinity scheduling, caching, and overlapping communication with computation. We show several applications programmed with OmpSs and their performance with multiple GPUs in a local node and in remote nodes. The results show a good tradeoff between performance and programmer effort.
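The abstract mentions affinity scheduling and caching as ways to reduce data movement. The toy model below is only an illustrative sketch of that general idea, not the OmpSs runtime or its actual policy: the names `Device`, `Task`, `affinity`, `round_robin`, and `run` are hypothetical, caches are assumed unbounded, and coherence (invalidation of stale copies) is ignored. It counts how many data blocks must be transferred when each task is placed on the device that already holds most of its inputs, compared with a placement that ignores data location.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    """Hypothetical device with an unbounded software cache of data blocks."""
    name: str
    cache: set = field(default_factory=set)  # names of blocks resident here
    load: int = 0                            # number of tasks placed so far


@dataclass
class Task:
    """Hypothetical task described by the blocks it reads and writes."""
    name: str
    inputs: tuple
    outputs: tuple


def affinity(task, devices, index):
    """Prefer the device holding the most inputs; break ties by lower load."""
    return max(devices, key=lambda d: (len(d.cache & set(task.inputs)), -d.load))


def round_robin(task, devices, index):
    """Baseline placement that ignores where the data currently lives."""
    return devices[index % len(devices)]


def run(tasks, devices, pick):
    """Place tasks in order and return the number of block transfers needed."""
    transfers = 0
    for index, task in enumerate(tasks):
        dev = pick(task, devices, index)
        missing = set(task.inputs) - dev.cache    # blocks that must be copied in
        transfers += len(missing)
        dev.cache |= missing | set(task.outputs)  # inputs and results stay cached
        dev.load += 1
    return transfers


def make_devices():
    return [Device("gpu0"), Device("gpu1")]


# Two independent chains (a0 -> a1 -> a2 and b0 -> b1 -> b2), interleaved.
demo_tasks = [
    Task("A1", ("a0",), ("a1",)),
    Task("B1", ("b0",), ("b1",)),
    Task("B2", ("b1",), ("b2",)),
    Task("A2", ("a1",), ("a2",)),
]

print(run(demo_tasks, make_devices(), affinity))     # 2 transfers
print(run(demo_tasks, make_devices(), round_robin))  # 4 transfers
```

In this small example, affinity placement keeps each chain on the device that produced its intermediate block, so only the two initial inputs are copied. The data-oblivious baseline also has to move both intermediate blocks.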
Peer Reviewed
Authors and affiliations:
1. Bueno, J. (javier.bueno@bsc.es), Barcelona Supercomputing Center, Barcelona, Spain
2. Planas, J. (judit.planas@bsc.es), Barcelona Supercomputing Center, Barcelona, Spain
3. Duran, A. (alex.duran@bsc.es), Barcelona Supercomputing Center, Barcelona, Spain
4. Badia, R. M. (rosa.m.badia@bsc.es), Barcelona Supercomputing Center, Artificial Intelligence Research Institute (IIIA), Barcelona, Spain
5. Martorell, X. (xavier.martorell@bsc.es), Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Barcelona, Spain
6. Ayguade, E. (eduard.ayguade@bsc.es), Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Barcelona, Spain
7. Labarta, J. (jesus.labarta@bsc.es), Barcelona Supercomputing Center, Universitat Politècnica de Catalunya, Barcelona, Spain
CODEN IEEPAD
ContentType Conference Proceeding
Publication
Contributor Universitat Politècnica de Catalunya. CAP - Grup de Computació d'Altes Prestacions
Universitat Politècnica de Catalunya. Departament d'Arquitectura de Computadors
Copyright info:eu-repo/semantics/openAccess
DOI 10.1109/IPDPS.2012.58
Discipline Computer Science
EISBN 9780769546759
0769546757
EndPage 568
ExternalDocumentID oai_recercat_cat_2072_202868
6267858
Genre orig-research
ISBN 1467309753
9781467309752
ISICitedReferencesCount 88
ISSN 1530-2075
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
OpenAccessLink https://recercat.cat/handle/2072/202868
PageCount 12
PublicationCentury 2000
PublicationDate 2012-May
2012
PublicationDateYYYYMMDD 2012-05-01
2012-01-01
PublicationDecade 2010
PublicationTitle 2012 IEEE 26th International Parallel and Distributed Processing Symposium
PublicationTitleAbbrev ipdps
PublicationYear 2012
Publisher IEEE
SourceID csuc
ieee
SourceType Open Access Repository
Publisher
StartPage 557
SubjectTerms accelerators
Arquitectura de computadors (computer architecture)
Arquitectures distribuïdes (distributed architectures)
Cluster programming
Coherence
Computació distribuïda (distributed computing)
Computational grids (Computer systems)
Computer architecture
GPGPU computing
Graphics processing unit
Informàtica (computer science)
Kernel
Message systems
OpenMP
Programming
Runtime
Àrees temàtiques de la UPC (UPC subject areas)
Title Productive Programming of GPU Clusters with OmpSs
URI https://ieeexplore.ieee.org/document/6267858
https://recercat.cat/handle/2072/202868