An Efficient Training Accelerator for Transformers With Hardware-Algorithm Co-Optimization
| Published in: | IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 31, No. 11, pp. 1788–1801 |
|---|---|
| Main authors: | Shao, Haikuo; Lu, Jinming; Wang, Meiqi; Wang, Zhongfeng |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 01.11.2023 (The Institute of Electrical and Electronics Engineers, Inc.) |
| Subjects: | |
| ISSN: | 1063-8210, 1557-9999 |
| Online access: | Full text |
| Abstract | Transformers have achieved significant success in deep learning, and training Transformers efficiently on resource-constrained platforms has been attracting continuous attention for domain adaptions and privacy concerns. However, deploying Transformers training on these platforms is still challenging due to its dynamic workloads, intensive computations, and massive memory accesses. To address these issues, we propose an Efficient Training Accelerator for TRansformers (TRETA) through a hardware-algorithm co-optimization strategy. First, a hardware-friendly mixed-precision training algorithm is presented based on a compact and efficient data format, which significantly reduces the computation and memory requirements. Second, a flexible and scalable architecture is proposed to achieve high utilization of computing resources when processing arbitrary irregular general matrix multiplication (GEMM) operations during training. These irregular GEMMs lead to severe under-utilization when simply mapped on traditional systolic architectures. Third, we develop training-oriented architectures for the crucial Softmax and layer normalization functions in Transformers, respectively. These area-efficient modules have unified and flexible microarchitectures to meet various computation requirements of different training phases. Finally, TRETA is implemented under Taiwan Semiconductor Manufacturing Company (TSMC) 28-nm technology and evaluated on multiple benchmarks. The experimental results show that our training framework achieves the same accuracy as the full precision baseline. Moreover, TRETA can achieve 14.71 tera operations per second (TOPS) and 3.31 TOPS/W in terms of throughput and energy efficiency, respectively. Compared with prior arts, the proposed design shows 1.4–24.5× speedup and 1.5–25.4× energy efficiency improvement. |
|---|---|
| Author | Wang, Zhongfeng; Wang, Meiqi; Shao, Haikuo; Lu, Jinming |
| Author details | – Shao, Haikuo (ORCID: 0009-0008-6965-3436; hkshao@smail.nju.edu.cn), School of Electronic Science and Engineering, Nanjing University, Nanjing, China – Lu, Jinming (ORCID: 0000-0002-7134-6514; jmlu@smail.nju.edu.cn), School of Electronic Science and Engineering, Nanjing University, Nanjing, China – Wang, Meiqi (ORCID: 0000-0001-9553-3640; wangmq53@mail.sysu.edu.cn), School of Integrated Circuits, Sun Yat-sen University, Shenzhen, China – Wang, Zhongfeng (ORCID: 0000-0002-7227-4786; zfwang@nju.edu.cn), School of Electronic Science and Engineering, Nanjing University, Nanjing, China |
| CODEN | IEVSE9 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2023 |
| DOI | 10.1109/TVLSI.2023.3305569 |
| Discipline | Engineering |
| EISSN | 1557-9999 |
| EndPage | 1801 |
| Genre | orig-research |
| GrantInformation | – National Natural Science Foundation of China, Grant 62174084 (funder ID 10.13039/501100001809) – National Key Research and Development Program of China, Grant 2022YFB4400604 (funder ID 10.13039/501100012166) |
| ISSN | 1063-8210 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 11 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| ORCID | 0000-0001-9553-3640 0009-0008-6965-3436 0000-0002-7134-6514 0000-0002-7227-4786 |
| PageCount | 14 |
| PublicationDate | 2023-11-01 |
| PublicationPlace | New York |
| PublicationTitle | IEEE transactions on very large scale integration (VLSI) systems |
| PublicationTitleAbbrev | TVLSI |
| PublicationYear | 2023 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| StartPage | 1788 |
| SubjectTerms | Algorithm-hardware codesign; Algorithms; Computational modeling; Computer architecture; Computer memory; Data models; Energy efficiency; general matrix multiplication (GEMM); Hardware; Memory management; nonlinear function; Optimization; Platforms; Task analysis; Technology assessment; Training; training accelerator; Transformer; Transformers |
| Title | An Efficient Training Accelerator for Transformers With Hardware-Algorithm Co-Optimization |
| URI | https://ieeexplore.ieee.org/document/10251161 https://www.proquest.com/docview/2878508356 |
| Volume | 31 |
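The abstract above attributes much of TRETA's hardware efficiency to avoiding the under-utilization that irregular GEMM shapes cause on a fixed systolic array. The short Python sketch below only illustrates that general effect, not the paper's actual architecture: the 32×32 array size, the output-stationary tiling model, the function name `systolic_utilization`, and the example matrix shapes are assumptions made here for demonstration.

```python
import math

def systolic_utilization(m, k, n, rows=32, cols=32):
    """Estimate PE utilization when an m x k x n GEMM is tiled onto a
    rows x cols output-stationary systolic array. Toy model: every output
    tile occupies the full array for k accumulation steps, so partially
    filled edge tiles waste PE-cycles."""
    useful = m * n * k                                # MACs the GEMM actually needs
    tiles = math.ceil(m / rows) * math.ceil(n / cols)
    occupied = tiles * rows * cols * k                # PE-cycles reserved, idle PEs included
    return useful / occupied

# A large square GEMM maps well; the skinny GEMMs common in Transformer
# training (small batch, sequence, or head dimensions) do not.
print(f"square   1024x1024x1024: {systolic_utilization(1024, 1024, 1024):.2%}")
print(f"irregular   12x64x512  : {systolic_utilization(12, 64, 512):.2%}")

# The abstract's headline figures also imply the accelerator's power draw:
# 14.71 TOPS / 3.31 TOPS/W ~= 4.44 W.
print(f"implied power: {14.71 / 3.31:.2f} W")
```

Under this toy model the square GEMM keeps every processing element busy, while the skinny 12×64×512 GEMM leaves well over half of the array idle; this is the kind of shape mismatch the paper's flexible GEMM architecture is designed to avoid. The last line simply converts the reported 14.71 TOPS and 3.31 TOPS/W into an implied power budget of about 4.4 W.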