INSTA: An Ultra-Fast, Differentiable, Statistical Static Timing Analysis Engine for Industrial Physical Design Applications

Bibliographic Details
Published in: 2025 62nd ACM/IEEE Design Automation Conference (DAC), pp. 1-7
Main Authors: Lu, Yi-Chen; Guo, Zhizheng; Kunal, Kishor; Liang, Rongjian; Ren, Haoxing
Format: Conference Proceeding
Language: English
Published: IEEE 22.06.2025
Description
Summary: Prior GPU-accelerated Static Timing Analysis (GPU-STA) works all struggle to find industrial adoption, primarily because they aim to build standalone timing engines that can never emulate the proprietary delay models used in commercial tools. In this paper, we adopt a different philosophy by presenting INSTA, the first-ever differentiable, statistical GPU-STA engine that achieves unprecedented accuracy and scalability by a one-time initialization from any reference tool, bringing two transformative capabilities to Physical Design (PD): (1) rapid, high-fidelity timing analysis for incremental netlist updates, and (2) gradient-based, truly global timing optimization at scale. Notably, INSTA demonstrates a near-perfect 0.999 correlation with an industry-leading signoff tool on a 15-million-pin design in a commercial 3 nm node with runtime under 0.1 seconds. Experimental results showcase INSTA's capability through three PD applications: (1) serving as a fast evaluator in an industrial gate sizing flow, achieving 25x faster incremental update_timing runtime with almost no accuracy loss; (2) INSTA-Size, a gradient-based gate sizer that achieves up to 15% better Total Negative Slack (TNS) than the reference signoff engine by sizing 68% fewer cells; and (3) INSTA-Place, a differentiable timing-driven global placer that outperforms the state-of-the-art net-weighting placer by up to 16% in Half-Perimeter Wirelength (HPWL) and 59.4% in TNS on the ICCAD'15 benchmark [15].
DOI: 10.1109/DAC63849.2025.11132858
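
The summary reports results in two standard physical-design metrics, TNS and HPWL. The sketch below shows how these metrics are conventionally computed; it is not INSTA's implementation. The function names, the list-of-slacks input, and the net-to-pin-coordinates dictionary are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) pin location; hypothetical representation


def total_negative_slack(endpoint_slacks: List[float]) -> float:
    """Conventional TNS: sum of slacks over violating endpoints only.

    Endpoints with non-negative slack contribute zero, so the result
    is always <= 0 and is 0.0 when timing is met everywhere.
    """
    return sum(min(0.0, s) for s in endpoint_slacks)


def half_perimeter_wirelength(nets: Dict[str, List[Point]]) -> float:
    """Conventional HPWL: for each net, the half perimeter of the
    bounding box around its pins, summed over all nets."""
    total = 0.0
    for pins in nets.values():
        if len(pins) < 2:
            continue  # a single-pin net has no wire to estimate
        xs = [x for x, _ in pins]
        ys = [y for _, y in pins]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


# Illustrative data (made up, not from the paper).
example_slacks = [0.5, -1.0, -2.0, 0.0]
example_nets = {
    "n1": [(0.0, 0.0), (3.0, 4.0)],
    "n2": [(1.0, 1.0), (2.0, 5.0), (4.0, 2.0)],
    "n3": [(7.0, 7.0)],
}
tns = total_negative_slack(example_slacks)
hpwl = half_perimeter_wirelength(example_nets)
```

Under these definitions, a "better" TNS is one closer to zero and a better HPWL is a smaller one, which is the sense in which the percentage improvements in the summary are usually read.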