Rapid identification of architectural bottlenecks via precise event counting
On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by ineffic...
| Published in: | 2011 38th Annual International Symposium on Computer Architecture (ISCA), pp. 353-364 |
|---|---|
| Main authors: | Demme, John; Sethumadhavan, Simha |
| Format: | Conference paper |
| Language: | English |
| Published: | New York, NY, USA; ACM; IEEE; 4 June 2011 |
| Series: | ACM Conferences |
| Subjects: | |
| ISBN: | 9781450304726, 1450304729 |
| ISSN: | 1063-6897 |
| Online access: | Get full text |
| Abstract | On-chip performance counters play a vital role in computer architecture research due to their ability to quickly provide insights into application behaviors that are time consuming to characterize with traditional methods. The usefulness of modern performance counters, however, is limited by inefficient techniques used today to access them. Current access techniques rely on imprecise sampling or heavyweight kernel interaction forcing users to choose between precision or speed and thus restricting the use of performance counter hardware.
In this paper, we describe new methods that enable precise, lightweight interfacing to on-chip performance counters. These low-overhead techniques allow precise reading of virtualized counters in low tens of nanoseconds, which is one to two orders of magnitude faster than current access techniques. Further, these tools provide several fresh insights on the behavior of modern parallel programs such as MySQL and Firefox, which were previously obscured (or impossible to obtain) by existing methods for characterization. Based on case studies with our new access methods, we discuss seven implications for computer architects in the cloud era and three methods for enhancing hardware counters further. Taken together, these observations have the potential to open up new avenues for architecture research. |
|---|---|
| Author | Demme, John; Sethumadhavan, Simha |
| Author_xml | 1. John Demme (jdd@cs.columbia.edu), Columbia University, NY, NY, USA; 2. Simha Sethumadhavan (simha@cs.columbia.edu), Columbia University, NY, NY, USA |
| ContentType | Conference Proceeding |
| Copyright | 2011 ACM |
| DOI | 10.1145/2000064.2000107 |
| Discipline | Computer Science |
| Genre | orig-research |
| ISICitedReferencesCount | 24 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Keywords | performance evaluation; hardware performance counters; locking |
| MeetingName | ISCA '11: The 38th Annual International Symposium on Computer Architecture |
| PageCount | 12 |
| SubjectTerms | Computer architecture; General and reference -- Cross-computing tools and techniques -- Measurement; General and reference -- Cross-computing tools and techniques -- Metrics; Hardware; Hardware -- Hardware validation; Hardware Performance Counters; Instruments; Kernel; Libraries; Linux; Locking; Performance Evaluation; Radiation detectors |
| URI | https://ieeexplore.ieee.org/document/6307772 |

