Rapid identification of architectural bottlenecks via precise event counting

On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by ineffic...

Bibliographic details
Published in: 2011 38th Annual International Symposium on Computer Architecture (ISCA), pp. 353-364
Main authors: Demme, John; Sethumadhavan, Simha
Format: Conference paper
Language: English
Published: New York, NY, USA: ACM, 4 June 2011
IEEE
Series: ACM Conferences
ISBN:9781450304726, 1450304729
ISSN:1063-6897
Abstract On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time-consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by the inefficient techniques used today to access them. Current access techniques rely on imprecise sampling or heavyweight kernel interaction, forcing users to choose between precision and speed and thus restricting the use of performance counter hardware. In this paper, we describe new methods that enable precise, lightweight interfacing to on-chip performance counters. These low-overhead techniques allow precise reading of virtualized counters in the low tens of nanoseconds, which is one to two orders of magnitude faster than current access techniques. Further, these tools provide several fresh insights into the behavior of modern parallel programs such as MySQL and Firefox, which were previously obscured by (or impossible to obtain with) existing characterization methods. Based on case studies with our new access methods, we discuss seven implications for computer architects in the cloud era and three methods for enhancing hardware counters further. Taken together, these observations have the potential to open up new avenues for architecture research.
Author Demme, John
Sethumadhavan, Simha
Author_xml – sequence: 1
  givenname: John
  surname: Demme
  fullname: Demme, John
  email: jdd@cs.columbia.edu
  organization: Columbia University, NY, NY, USA
– sequence: 2
  givenname: Simha
  surname: Sethumadhavan
  fullname: Sethumadhavan, Simha
  email: simha@cs.columbia.edu
  organization: Columbia University, NY, NY, USA
ContentType Conference Proceeding
Copyright 2011 ACM
Copyright_xml – notice: 2011 ACM
DOI 10.1145/2000064.2000107
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan (POP) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE/IET Electronic Library
IEEE Proceedings Order Plans (POP) 1998-present
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE/IET Electronic Library
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9781450304726
1450304729
EndPage 364
ExternalDocumentID 6307772
Genre orig-research
ISBN 9781450304726
1450304729
ISICitedReferencesCount 24
ISSN 1063-6897
IsPeerReviewed false
IsScholarly true
Keywords performance evaluation
hardware performance counters
locking
Language English
License Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org
LinkModel DirectLink
MeetingName ISCA '11: The 38th Annual International Symposium on Computer Architecture
PageCount 12
PublicationCentury 2000
PublicationDate 20110604
2011-June
PublicationDateYYYYMMDD 2011-06-04
2011-06-01
PublicationDate_xml – month: 06
  year: 2011
  text: 20110604
  day: 04
PublicationDecade 2010
PublicationPlace New York, NY, USA
PublicationPlace_xml – name: New York, NY, USA
PublicationSeriesTitle ACM Conferences
PublicationTitle 2011 38th Annual International Symposium on Computer Architecture (ISCA)
PublicationTitleAbbrev ISCA
PublicationYear 2011
Publisher ACM
IEEE
Publisher_xml – name: ACM
– name: IEEE
SourceID ieee
acm
SourceType Publisher
StartPage 353
SubjectTerms Computer architecture
General and reference -- Cross-computing tools and techniques -- Measurement
General and reference -- Cross-computing tools and techniques -- Metrics
Hardware
Hardware -- Hardware validation
Hardware Performance Counters
Instruments
Kernel
Libraries
Linux
Locking
Performance Evaluation
Radiation detectors
Title Rapid identification of architectural bottlenecks via precise event counting
URI https://ieeexplore.ieee.org/document/6307772