Hardware Implementation of Approximate Fixed-point Divider for Machine Learning Optimization Algorithm

Division operation is necessary for many applications, especially optimization algorithms for machine learning. Usually, a certain degree of loss is acceptable in calculating nonsignificant intermediate variables for a considerable speed improvement. This paper proposes a specialized divider to acce...

Published in: Asia Pacific Conference on Postgraduate Research in Microelectronics & Electronics (Online), pp. 22-25
Main authors: Han, Gandong; Zhang, Weiyi; Niu, Liting; Zhang, Chun; Wang, Zhihua; Wang, Ziqiang
Format: Conference paper
Language: English
Published: IEEE, 11 November 2022
ISSN:2159-2160
Abstract Division is necessary for many applications, especially optimization algorithms for machine learning. A certain degree of loss is usually acceptable when calculating nonsignificant intermediate variables if it yields a considerable speed improvement. This paper proposes a specialized divider to accelerate hardware implementations of machine learning optimization algorithms. Inspired by the fast inverse square root algorithm, we designed a hardware implementation method that generates an approximate division result using conversion between floating-point and fixed-point numbers and multiplication. This paper presents three versions of the divider: fastDiv_accuracy, a conventional design with 35% less delay and minimal error compared to the delay-minimized standard divider from the Synopsys DesignWare library; fastDiv_area, an area-oriented design with 67% less delay and acceptable error compared to the standard divider constrained to the same area; and fastDiv_speed, the fastest design, with 54% less delay compared to the delay-minimized standard divider. All three versions can be applied on demand when deploying optimization algorithms in FPGA or ASIC designs.
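The abstract describes the approach only at a high level, so the following is a minimal software sketch of the general idea (a reciprocal bit trick in the style of the fast inverse square root, followed by a multiplication), not the authors' hardware design. The function names, the Q16.16 fixed-point format, the optional Newton-Raphson refinement, and the constant `MAGIC` are all assumptions made for illustration; none of them come from the record.

```python
import struct

# Assumed tuned constant for the reciprocal bit trick (not taken from the paper).
# Subtracting the float32 bit pattern from it roughly negates the exponent.
MAGIC = 0x7EF311C7


def approx_reciprocal(x: float, newton_steps: int = 1) -> float:
    """Approximate 1/x for positive x (well below 2**126) via float32 bits."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    # Reinterpret the float32 encoding of x as an unsigned 32-bit integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Integer subtraction gives a coarse first guess of the reciprocal.
    guess_bits = (MAGIC - bits) & 0xFFFFFFFF
    r = struct.unpack("<f", struct.pack("<I", guess_bits))[0]
    # Optional Newton-Raphson refinement: r <- r * (2 - x * r).
    for _ in range(newton_steps):
        r = r * (2.0 - x * r)
    return r


def fast_div_fixed(a: int, b: int, frac_bits: int = 16,
                   newton_steps: int = 1) -> int:
    """Approximate a / b for raw fixed-point integers in the same Q format.

    The result is returned in that Q format as well (hypothetical helper).
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")
    sign = -1 if (a < 0) != (b < 0) else 1
    # Fixed-point -> floating-point conversion, then reciprocal and multiply.
    r = approx_reciprocal(float(abs(b)), newton_steps)
    # Scale by 2**frac_bits and truncate back to a fixed-point integer.
    return sign * int(abs(a) * r * (1 << frac_bits))


def to_q16(value: float) -> int:
    """Encode a real number as a raw Q16.16 integer (illustrative helper)."""
    return int(round(value * (1 << 16)))
```

For example, `fast_div_fixed(to_q16(3.0), to_q16(7.0)) / 65536` gives a value close to 3/7. Setting `newton_steps=0` keeps only the bit trick and the multiplication, trading accuracy for fewer operations, which loosely mirrors the speed-versus-error trade-off the abstract reports across its three divider versions.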
Author Zhang, Chun
Wang, Zhihua
Zhang, Weiyi
Niu, Liting
Wang, Ziqiang
Han, Gandong
Author_xml – sequence: 1
  givenname: Gandong
  surname: Han
  fullname: Han, Gandong
  organization: School of Integrated Circuits, Tsinghua University, Beijing, China
– sequence: 2
  givenname: Weiyi
  surname: Zhang
  fullname: Zhang, Weiyi
  organization: School of Integrated Circuits, Tsinghua University, Beijing, China
– sequence: 3
  givenname: Liting
  surname: Niu
  fullname: Niu, Liting
  organization: School of Integrated Circuits, Tsinghua University, Beijing, China
– sequence: 4
  givenname: Chun
  surname: Zhang
  fullname: Zhang, Chun
  organization: School of Integrated Circuits, Tsinghua University, Beijing, China
– sequence: 5
  givenname: Zhihua
  surname: Wang
  fullname: Wang, Zhihua
  email: zhihua@tsinghua.edu.cn
  organization: Research Institute of Tsinghua University in Shenzhen, Shenzhen, China
– sequence: 6
  givenname: Ziqiang
  surname: Wang
  fullname: Wang, Ziqiang
  email: wangziq@tsinghua.edu.cn
  organization: Research Institute of Tsinghua University in Shenzhen, Shenzhen, China
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/PrimeAsia56064.2022.10104001
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Engineering
EISBN 9781665450690
166545069X
EISSN 2159-2160
EndPage 25
ExternalDocumentID 10104001
Genre orig-research
GrantInformation_xml – fundername: Shenzhen Science and Technology Program
  grantid: JSGG20191129141019090,JCYJ20180306170609470,SGLH20180622095014688
  funderid: 10.13039/501100010877
GroupedDBID 6IE
6IF
6IL
6IN
AAWTH
ABLEC
ADZIZ
ALMA_UNASSIGNED_HOLDINGS
BEFXN
BFFAM
BGNUA
BKEBE
BPEOZ
CBEJK
CHZPO
IEGSK
OCL
RIE
RIL
ID FETCH-LOGICAL-i204t-13e2e5a6aa09e106ba1e46e787f16adf49b279c94f99849ef0c8e7c7584fc6583
IEDL.DBID RIE
ISICitedReferencesCount 1
ISICitedReferencesURI http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=001004162500006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
IngestDate Wed Aug 27 02:21:30 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
MergedId FETCHMERGED-LOGICAL-i204t-13e2e5a6aa09e106ba1e46e787f16adf49b279c94f99849ef0c8e7c7584fc6583
PageCount 4
ParticipantIDs ieee_primary_10104001
PublicationCentury 2000
PublicationDate 2022-Nov.-11
PublicationDateYYYYMMDD 2022-11-11
PublicationDate_xml – month: 11
  year: 2022
  text: 2022-Nov.-11
  day: 11
PublicationDecade 2020
PublicationTitle Asia Pacific Conference on Postgraduate Research in Microelectronics & Electronics (Online)
PublicationTitleAbbrev PRIMEASIA
PublicationYear 2022
Publisher IEEE
Publisher_xml – name: IEEE
SSID ssj0003204130
Score 1.82025
Snippet Division operation is necessary for many applications, especially optimization algorithms for machine learning. Usually, a certain degree of loss is acceptable...
SourceID ieee
SourceType Publisher
StartPage 22
SubjectTerms Approximation algorithms
Delays
fast square root
fixed-point division
Hardware
hardware acceleration
Libraries
Machine learning
Machine learning algorithms
optimization algorithm
Throughput
Title Hardware Implementation of Approximate Fixed-point Divider for Machine Learning Optimization Algorithm
URI https://ieeexplore.ieee.org/document/10104001
WOSCitedRecordID wos001004162500006&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
linkProvider IEEE