GPU-accelerated Path-based Timing Analysis

Bibliographic details
Published in:	2021 58th ACM/IEEE Design Automation Conference (DAC), pp. 721-726
Main authors:	Guo, Guannan; Huang, Tsung-Wei; Lin, Yibo; Wong, Martin
Format:	Conference paper
Language:	English
Publication details:	IEEE, 5 December 2021
Subjects:	Data structures; Graphics processing units; Instruction sets; Logic gates; Parallel processing; Runtime; Timing

Abstract	Path-based Analysis (PBA) is an important step in the design closure flow for reducing slack pessimism. However, PBA is extremely time-consuming. Recent years have seen many parallel PBA algorithms, but most of them are architecturally constrained by CPU parallelism and do not scale beyond a few threads. To overcome this challenge, we propose in this paper a new fast and accurate PBA algorithm by harnessing the power of the graphics processing unit (GPU). We introduce GPU-efficient data structures, high-performance kernels, and efficient CPU-GPU task decomposition strategies to accelerate PBA to a new performance milestone. Experimental results show that our method can speed up the state-of-the-art algorithm by 543× on a design of 1.6 million gates with exact accuracy. At the extreme, our method of 1 CPU and 1 GPU outperforms the state-of-the-art algorithm of 40 CPUs by 25-45×.
Authors	Guo, Guannan (University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA); Huang, Tsung-Wei (University of Utah, Department of Electrical and Computer Engineering, Salt Lake City, UT, USA); Lin, Yibo (Peking University, Department of Computer Science, Beijing, China); Wong, Martin (University of Illinois at Urbana-Champaign, Department of Electrical and Computer Engineering, IL, USA)
ContentType	Conference Proceeding
DOI	10.1109/DAC18074.2021.9586316
EISBN	9781665432740 1665432748
EndPage	726
ExternalDocumentID	9586316
Genre	orig-research
ISICitedReferencesCount	23
IsPeerReviewed	false
IsScholarly	true
Language	English
PageCount	6
PublicationCentury	2000
PublicationDate	2021-Dec.-5
PublicationDateYYYYMMDD	2021-12-05
PublicationDecade	2020
PublicationTitle	2021 58th ACM/IEEE Design Automation Conference (DAC)
PublicationTitleAbbrev	DAC
PublicationYear	2021
Publisher	IEEE
StartPage	721
Title	GPU-accelerated Path-based Timing Analysis
URI	https://ieeexplore.ieee.org/document/9586316