Managing GPU Concurrency in Heterogeneous Architectures

Bibliographic details
Published in: 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture, pp. 114-126
Main authors: Kayiran, Onur; Nachiappan, Nachiappan Chidambaram; Jog, Adwait; Ausavarungnirun, Rachata; Kandemir, Mahmut T.; Loh, Gabriel H.; Mutlu, Onur; Das, Chita R.
Format: Conference proceeding
Language: English
Published: IEEE, 1 December 2014
ISSN: 1072-4451
Online access: Full text
Abstract Heterogeneous architectures consisting of general-purpose CPUs and throughput-optimized GPUs are projected to be the dominant computing platforms for many classes of applications. The design of such systems is more complex than that of homogeneous architectures because maximizing resource utilization while minimizing shared resource interference between CPU and GPU applications is difficult. We show that GPU applications tend to monopolize the shared hardware resources, such as memory and network, because of their high thread-level parallelism (TLP), and discuss the limitations of existing GPU-based concurrency management techniques when employed in heterogeneous systems. To solve this problem, we propose an integrated concurrency management strategy that modulates the TLP in GPUs to control the performance of both CPU and GPU applications. This mechanism considers both GPU core state and system-wide memory and network congestion information to dynamically decide on the level of GPU concurrency to maximize system performance. We propose and evaluate two schemes: one (CM-CPU) for boosting CPU performance in the presence of GPU interference, the other (CM-BAL) for improving both CPU and GPU performance in a balanced manner and thus overall system performance. Our evaluations show that the first scheme improves average CPU performance by 24%, while reducing average GPU performance by 11%. The second scheme provides 7% average performance improvement for both CPU and GPU applications. We also show that our solution allows the user to control performance trade-offs between CPUs and GPUs.
Author Loh, Gabriel H.
Jog, Adwait
Ausavarungnirun, Rachata
Kandemir, Mahmut T.
Mutlu, Onur
Kayiran, Onur
Das, Chita R.
Nachiappan, Nachiappan Chidambaram
Author_xml – sequence: 1
  givenname: Onur
  surname: Kayiran
  fullname: Kayiran, Onur
  email: onur@cse.psu.edu
  organization: Pennsylvania State Univ., University Park, PA, USA
– sequence: 2
  givenname: Nachiappan Chidambaram
  surname: Nachiappan
  fullname: Nachiappan, Nachiappan Chidambaram
  email: nachi@cse.psu.edu
  organization: Pennsylvania State Univ., University Park, PA, USA
– sequence: 3
  givenname: Adwait
  surname: Jog
  fullname: Jog, Adwait
  email: adwait@cse.psu.edu
  organization: Pennsylvania State Univ., University Park, PA, USA
– sequence: 4
  givenname: Rachata
  surname: Ausavarungnirun
  fullname: Ausavarungnirun, Rachata
  email: rachata@cmu.edu
  organization: Carnegie Mellon Univ., Pittsburgh, PA, USA
– sequence: 5
  givenname: Mahmut T.
  surname: Kandemir
  fullname: Kandemir, Mahmut T.
  email: kandemir@cse.psu.edu
  organization: Pennsylvania State Univ., University Park, PA, USA
– sequence: 6
  givenname: Gabriel H.
  surname: Loh
  fullname: Loh, Gabriel H.
  email: gabriel.loh@amd.com
– sequence: 7
  givenname: Onur
  surname: Mutlu
  fullname: Mutlu, Onur
  email: onur@cmu.edu
  organization: Carnegie Mellon Univ., Pittsburgh, PA, USA
– sequence: 8
  givenname: Chita R.
  surname: Das
  fullname: Das, Chita R.
  email: das@cse.psu.edu
  organization: Pennsylvania State Univ., University Park, PA, USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/MICRO.2014.62
Discipline Computer Science
EISBN 1479969982
9781479969982
EndPage 126
ExternalDocumentID 7011382
Genre orig-research
ISICitedReferencesCount 58
ISSN 1072-4451
IsDoiOpenAccess false
IsOpenAccess true
IsPeerReviewed false
IsScholarly true
Language English
OpenAccessLink https://figshare.com/articles/journal_contribution/Managing_GPU_Concurrency_in_Heterogeneous_Architectures/6468998
PageCount 13
PublicationCentury 2000
PublicationDate 2014-Dec.
PublicationDateYYYYMMDD 2014-12-01
PublicationDate_xml – month: 12
  year: 2014
  text: 2014-Dec.
PublicationDecade 2010
PublicationTitle 2014 47th Annual IEEE/ACM International Symposium on Microarchitecture
PublicationTitleAbbrev MICRO
PublicationYear 2014
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 114
SubjectTerms Bandwidth
Central Processing Unit
Computer architecture
concurrency
Concurrent computing
CPU-GPU
GPUs
Graphics processing units
heterogeneous architectures
Resource management
scheduling
System performance
thread-level parallelism
Title Managing GPU Concurrency in Heterogeneous Architectures
URI https://ieeexplore.ieee.org/document/7011382