OptiWISE: Combining Sampling and Instrumentation for Granular CPI Analysis
| Published in: | Proceedings / International Symposium on Code Generation and Optimization, pp. 373-385 |
|---|---|
| Main authors: | Guo, Yuxin; Chadwick, Alex W.; Erdos, Marton; Bora, Utpal; Vougioukas, Ilias; Gabrielli, Giacomo; Jones, Timothy M. |
| Format: | Conference paper |
| Language: | English |
| Publication details: | IEEE, 2 March 2024 |
| ISSN: | 2643-2838 |
| Online access: | Get full text |
| Abstract | Despite decades of improvement in compiler technology, it remains necessary to profile applications to improve performance. Existing profiling tools typically either sample hardware performance counters or instrument the program with extra instructions to analyze its execution. Both techniques are valuable with different strengths and weaknesses, but do not always correctly identify optimization opportunities. We present OPTIWISE, a profiling tool that runs the program twice, once with low-overhead sampling to accurately measure performance, and once with instrumentation to accurately capture control flow and execution counts. OPTIWISE then combines this information to give a highly detailed per-instruction CPI metric by computing the ratio of samples to execution counts, as well as aggregated information such as costs per loop, source-code line, or function. We evaluate OPTIWISE to show it has an overhead of 8.1× geomean, and 57× worst case on SPEC CPU2017 benchmarks. Using OPTIWISE, we present case studies of optimizing selected SPEC benchmarks on a modern x86 server processor. The per-instruction CPI metrics quickly reveal problems such as costly mispredicted branches and cache misses, which we use to manually optimize for effective performance improvements. |
|---|---|
| Author | Guo, Yuxin (yg413@cl.cam.ac.uk, University of Cambridge, UK); Chadwick, Alex W. (alex.chadwick@cl.cam.ac.uk, University of Cambridge, UK); Erdos, Marton (marton.erdos@cl.cam.ac.uk, University of Cambridge, UK); Bora, Utpal (ub230@cl.cam.ac.uk, University of Cambridge, UK); Vougioukas, Ilias (ilias.vougioukas@arm.com, Arm, USA); Gabrielli, Giacomo (giacomo.gabrielli@arm.com, Arm, UK); Jones, Timothy M. (timothy.jones@cl.cam.ac.uk, University of Cambridge, UK) |
| ContentType | Conference Proceeding |
| DOI | 10.1109/CGO57630.2024.10444771 |
| Discipline | Computer Science |
| EISBN | 9798350395099 |
| EISSN | 2643-2838 |
| EndPage | 385 |
| ExternalDocumentID | 10444771 |
| Genre | orig-research |
| ISICitedReferencesCount | 3 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| OpenAccessLink | https://www.repository.cam.ac.uk/handle/1810/363050 |
| PageCount | 13 |
| PublicationCentury | 2000 |
| PublicationDate | 2024-March-2 |
| PublicationDateYYYYMMDD | 2024-03-02 |
| PublicationDecade | 2020 |
| PublicationTitle | Proceedings / International Symposium on Code Generation and Optimization |
| PublicationTitleAbbrev | CGO |
| PublicationYear | 2024 |
| Publisher | IEEE |
| StartPage | 373 |
| SubjectTerms | Benchmark testing; Codes; Costs; Hardware; Instruments; Measurement; Optimization; Servers |
| Title | OptiWISE: Combining Sampling and Instrumentation for Granular CPI Analysis |
| URI | https://ieeexplore.ieee.org/document/10444771 |
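
The abstract describes the per-instruction CPI metric as the ratio of samples (from the sampling run) to execution counts (from the instrumentation run). The following is a minimal Python sketch of that ratio and of a simple aggregation step. It is not OptiWISE's implementation: the function names, the `cycles_per_sample` parameter (the assumed number of cycles between samples), the addresses, and the toy data are all hypothetical.

```python
from collections import defaultdict


def per_instruction_cpi(samples, exec_counts, cycles_per_sample=1.0):
    """Estimate CPI per instruction as (samples * cycles_per_sample) / executions.

    samples:      {address: samples attributed to it in the sampling run}
    exec_counts:  {address: times it executed in the instrumentation run}
    cycles_per_sample is an assumed sampling period, not a value from the paper.
    """
    cpi = {}
    for addr, count in exec_counts.items():
        if count == 0:
            continue  # never executed, so CPI is undefined
        cpi[addr] = samples.get(addr, 0) * cycles_per_sample / count
    return cpi


def aggregate_cycles(samples, addr_to_group, cycles_per_sample=1.0):
    """Sum estimated cycles per group (for example a loop, source line, or function)."""
    totals = defaultdict(float)
    for addr, n in samples.items():
        totals[addr_to_group.get(addr, "<unknown>")] += n * cycles_per_sample
    return dict(totals)


# Toy data (hypothetical): three instructions in one loop, one never executed.
samples = {0x401000: 5, 0x401004: 120, 0x401008: 3}
exec_counts = {0x401000: 10_000, 0x401004: 10_000, 0x401008: 10_000, 0x40100C: 0}
groups = {0x401000: "loop_a", 0x401004: "loop_a", 0x401008: "loop_a"}

cpi = per_instruction_cpi(samples, exec_counts, cycles_per_sample=1000)
loop_cycles = aggregate_cycles(samples, groups, cycles_per_sample=1000)

# The instruction with the highest CPI is the first candidate to inspect.
hottest = max(cpi, key=cpi.get)
```

In this toy data all three instructions execute equally often, yet the one at `0x401004` collects most of the samples, so it stands out with a far higher CPI. That is the kind of signal the abstract attributes to costly mispredicted branches and cache misses.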