RPkNN: An OpenCL-Based FPGA Implementation of the Dimensionality-Reduced kNN Algorithm Using Random Projection
| Published in: | IEEE transactions on very large scale integration (VLSI) systems, Volume 30, Issue 4, pp. 549-552 |
|---|---|
| Main authors: | Erfan Bank Tavakoli, Amir Beygi, Xuebin Yao |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 April 2022 |
| ISSN: | 1063-8210, 1557-9999 |
| Abstract | Due to the so-called curse of dimensionality and the growing size of databases, there is an ever-increasing demand for computing resources and memory bandwidth when performing the k-nearest neighbors (kNN) algorithm, which slows the processing of large datasets. This work presents an OpenCL-based framework for accelerating the kNN algorithm on field-programmable gate arrays (FPGAs) that benefits from random projection dimensionality reduction. The proposed RPkNN framework includes two compute modules implementing a throughput-optimized hardware architecture based on random projection and the kNN algorithm, and a host program facilitating easy integration of the compute modules into existing applications. RPkNN also utilizes a new buffering scheme tailored to random projection and the kNN algorithm. The proposed architecture enables parallel kNN computations with a single memory channel and takes advantage of the sparsity features of the input data to provide a highly optimized, parallel implementation of random projection. We employ a computational storage device (CSD) to directly access the high-dimensional data on a non-volatile memory express (NVMe) solid-state drive (SSD) and to store and reuse the compressed, low-dimensional data in the FPGA's dynamic random access memory (DRAM), hence eliminating data transfers to the host DRAM. We compare RPkNN implemented on the Samsung SmartSSD CSD with the kNN implementation of the scikit-learn library running on an Intel Xeon Gold 6154 CPU. The experimental results show that the proposed RPkNN solution achieves, on average, $26\times$ and $46\times$ higher performance across different dimensions per single kNN computation for the SIFT1M and GIST1M databases, respectively. Finally, RPkNN is $1.7\times$ faster than the similar FPGA-based reference method. |
|---|---|
| Author | Yao, Xuebin; Bank Tavakoli, Erfan; Beygi, Amir |
| Author_xml | 1. Erfan Bank Tavakoli (ORCID: 0000-0002-3248-9301; email: ebanktav@asu.edu), School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA; 2. Amir Beygi (email: a.beygi@samsung.com), Memory Solutions Lab, Samsung Semiconductor, Inc, San Jose, CA, USA; 3. Xuebin Yao (email: xuebin.yao@samsung.com), Memory Solutions Lab, Samsung Semiconductor, Inc, San Jose, CA, USA |
| CODEN | ITCOB4 |
| CitedBy_id | crossref_primary_10_3390_s23125710 crossref_primary_10_1016_j_eswa_2024_123570 crossref_primary_10_1038_s41598_024_80210_x crossref_primary_10_1145_3616873 |
| Cites_doi | 10.1371/journal.pgen.1004573 10.1109/FPT.2016.7929193 10.1109/ICCD46524.2019.00030 10.1109/ICFPT51103.2020.00027 10.1090/conm/026/737400 10.1007/978-3-540-39964-3_62 10.1007/s10115-007-0114-2 10.1007/s10994-006-6265-7 10.1109/CVPR.2018.00517 10.1109/ICFPT51103.2020.00026 10.1002/9780470713785 10.1007/978-1-4419-9660-2 10.1145/1150402.1150436 |
| ContentType | Journal Article |
| Copyright | Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022 |
| DOI | 10.1109/TVLSI.2022.3147743 |
| DatabaseName | IEEE Xplore (IEEE); IEEE All-Society Periodicals Package (ASPP) 1998–Present; IEEE Electronic Library (IEL); CrossRef; Electronics & Communications Abstracts; Technology Research Database; Advanced Technologies Database with Aerospace |
| DatabaseTitle | CrossRef; Technology Research Database; Advanced Technologies Database with Aerospace; Electronics & Communications Abstracts |
| DatabaseTitleList | Technology Research Database |
| Database_xml | – sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1557-9999 |
| EndPage | 552 |
| ExternalDocumentID | 10_1109_TVLSI_2022_3147743 9714070 |
| Genre | orig-research |
| IEDL.DBID | RIE |
| ISICitedReferencesCount | 10 |
| ISSN | 1063-8210 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 4 |
| Language | English |
| License | https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037 |
| LinkModel | DirectLink |
| ORCID | 0000-0002-3248-9301 |
| PQID | 2641995220 |
| PQPubID | 85424 |
| PageCount | 4 |
| ParticipantIDs | crossref_primary_10_1109_TVLSI_2022_3147743 proquest_journals_2641995220 ieee_primary_9714070 crossref_citationtrail_10_1109_TVLSI_2022_3147743 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-04-01 |
| PublicationDateYYYYMMDD | 2022-04-01 |
| PublicationDate_xml | – month: 04 year: 2022 text: 2022-04-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationPlace | New York |
| PublicationPlace_xml | – name: New York |
| PublicationTitle | IEEE transactions on very large scale integration (VLSI) systems |
| PublicationTitleAbbrev | TVLSI |
| PublicationYear | 2022 |
| Publisher | IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| Publisher_xml | – name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| SourceID | proquest crossref ieee |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 549 |
| SubjectTerms | Algorithms; Bandwidth; Computation; Computer architecture; Dynamic random access memory; Field programmable gate arrays; Field-programmable gate array (FPGA); Hardware; k-nearest neighbors (kNNs); K-nearest neighbors algorithm; Kernel; Modules; near-storage acceleration; Projection; Random access memory; random projection; Solid state devices; Sparse matrices |
| Title | RPkNN: An OpenCL-Based FPGA Implementation of the Dimensionality-Reduced kNN Algorithm Using Random Projection |
| URI | https://ieeexplore.ieee.org/document/9714070 https://www.proquest.com/docview/2641995220 |
| Volume | 30 |
| WOSCitedRecordID | wos000758189100001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
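
The abstract describes a two-stage flow: high-dimensional vectors are first compressed by random projection, and kNN queries are then answered on the stored low-dimensional copy. The following is only a minimal software sketch of that idea in plain Python, not the authors' OpenCL kernels. The function names, the sign-based (+1/-1) projection matrix, and the brute-force search are all assumptions made for illustration; the record does not state which projection distribution or search structure RPkNN uses.

```python
import heapq
import random


def make_projection_matrix(d_in, d_out, seed=0):
    """Return a d_out x d_in matrix with entries +/- 1/sqrt(d_out).

    Assumption: a random-sign matrix is one common random projection;
    the abstract does not say which distribution RPkNN uses.
    """
    rng = random.Random(seed)
    scale = 1.0 / (d_out ** 0.5)
    return [[rng.choice((-scale, scale)) for _ in range(d_in)]
            for _ in range(d_out)]


def project(matrix, vector):
    """Project one vector, visiting only its nonzero entries (input sparsity)."""
    nonzero = [(j, x) for j, x in enumerate(vector) if x != 0.0]
    return [sum(row[j] * x for j, x in nonzero) for row in matrix]


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn(query, points, k):
    """Return the indices of the k points closest to query (brute force)."""
    return heapq.nsmallest(
        k, range(len(points)),
        key=lambda i: squared_distance(query, points[i]))


def build_index(dataset, d_out, seed=0):
    """Project the whole dataset once so the reduced copy can be reused."""
    matrix = make_projection_matrix(len(dataset[0]), d_out, seed)
    return matrix, [project(matrix, v) for v in dataset]


def query_index(matrix, reduced, query, k):
    """Project the query with the same matrix, then search the reduced data."""
    return knn(project(matrix, query), reduced, k)
```

In this sketch `build_index` plays the role of the one-time compression step whose output is kept and reused, while `query_index` corresponds to a single kNN computation on the reduced data. Results on projected data are approximate: neighbors found in the low-dimensional space may differ from the exact high-dimensional neighbors.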