GPU-accelerated Path-based Timing Analysis
Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years have seen many parallel PBA algorithms, but most of them are architecturally constrained by the CPU parallelism and do not scale beyond a few...
Saved in:
| Published in: | 2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 721-726 |
|---|---|
| Main authors: | Guo, Guannan; Huang, Tsung-Wei; Lin, Yibo; Wong, Martin |
| Format: | Conference paper |
| Language: | English |
| Published: | IEEE, 5 December 2021 |
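| Subjects: | Data structures; Graphics processing units; Instruction sets; Logic gates; Parallel processing; Runtime; Timing |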
| Online access: | Get full text |
| Abstract | Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years have seen many parallel PBA algorithms, but most of them are architecturally constrained by CPU parallelism and do not scale beyond a few threads. To overcome this challenge, we propose in this paper a new fast and accurate PBA algorithm that harnesses the power of the graphics processing unit (GPU). We introduce GPU-efficient data structures, high-performance kernels, and efficient CPU-GPU task decomposition strategies to accelerate PBA to a new performance milestone. Experimental results show that our method can speed up the state-of-the-art algorithm by 543× on a design of 1.6 million gates with exact accuracy. At the extreme, our method with 1 CPU and 1 GPU outperforms the state-of-the-art algorithm running on 40 CPUs by 25-45×. |
|---|---|
| Author | Huang, Tsung-Wei; Guo, Guannan; Wong, Martin; Lin, Yibo |
| Author_xml | 1. Guo, Guannan (University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA); 2. Huang, Tsung-Wei (University of Utah, Department of Electrical and Computer Engineering, Salt Lake City, UT, USA); 3. Lin, Yibo (Peking University, Department of Computer Science, Beijing, China); 4. Wong, Martin (University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA) |
| ContentType | Conference Proceeding |
| DBID | 6IE 6IH CBEJK RIE RIO |
| DOI | 10.1109/DAC18074.2021.9586316 |
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings; IEEE Proceedings Order Plan (POP) 1998-present by volume; IEEE Xplore All Conference Proceedings; IEEE Electronic Library (IEL); IEEE Proceedings Order Plans (POP) 1998-present |
| DatabaseTitleList | |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| EISBN | 9781665432740 1665432748 |
| EndPage | 726 |
| ExternalDocumentID | 9586316 |
| Genre | orig-research |
| GroupedDBID | 6IE 6IH ACM ALMA_UNASSIGNED_HOLDINGS CBEJK RIE RIO |
| ID | FETCH-LOGICAL-a287t-3ecc8c42f68d4b293c3b8e2f9f897ca7f1a46b55df080adbdd1123ef0747ab263 |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 23 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000766079700121&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| IngestDate | Wed Aug 27 02:28:29 EDT 2025 |
| IsPeerReviewed | false |
| IsScholarly | true |
| Language | English |
| LinkModel | DirectLink |
| MergedId | FETCHMERGED-LOGICAL-a287t-3ecc8c42f68d4b293c3b8e2f9f897ca7f1a46b55df080adbdd1123ef0747ab263 |
| PageCount | 6 |
| ParticipantIDs | ieee_primary_9586316 |
| PublicationCentury | 2000 |
| PublicationDate | 2021-Dec.-5 |
| PublicationDateYYYYMMDD | 2021-12-05 |
| PublicationDate_xml | – month: 12 year: 2021 text: 2021-Dec.-5 day: 05 |
| PublicationDecade | 2020 |
| PublicationTitle | 2021 58th ACM/IEEE Design Automation Conference (DAC) |
| PublicationTitleAbbrev | DAC |
| PublicationYear | 2021 |
| Publisher | IEEE |
| Publisher_xml | – name: IEEE |
| SSID | ssib060584060 |
| Score | 2.3195329 |
| Snippet | Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years... |
| SourceID | ieee |
| SourceType | Publisher |
| StartPage | 721 |
| SubjectTerms | Data structures; Graphics processing units; Instruction sets; Logic gates; Parallel processing; Runtime; Timing |
| Title | GPU-accelerated Path-based Timing Analysis |
| URI | https://ieeexplore.ieee.org/document/9586316 |
| WOSCitedRecordID | wos000766079700121&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| linkProvider | IEEE |