RPkNN: An OpenCL-Based FPGA Implementation of the Dimensionality-Reduced kNN Algorithm Using Random Projection

Due to the so-called curse of dimensionality and increase in the size of databases, there is an ever-increasing demand for computing resources and memory bandwidth when performing the k-nearest neighbors (kNNs) algorithm, resulting in a slow-down to process large datasets. This work presents an Open...

Detailed bibliography
Published in:	IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Volume 30, Issue 4, pp. 549-552
Main authors:	Bank Tavakoli, Erfan; Beygi, Amir; Yao, Xuebin
Format:	Journal Article
Language:	English
Published:	New York, The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 1 April 2022
Subjects:	Algorithms; Bandwidth; Computation; Computer architecture; Dynamic random access memory; Field programmable gate arrays; Field-programmable gate array (FPGA); Hardware; k-nearest neighbors (kNNs); K-nearest neighbors algorithm; Kernel; Modules; near-storage acceleration; Projection; Random access memory; random projection; Solid state devices; Sparse matrices
ISSN:	1063-8210, 1557-9999

Abstract	Due to the so-called curse of dimensionality and the growing size of databases, the k-nearest neighbors (kNNs) algorithm places an ever-increasing demand on computing resources and memory bandwidth, which slows the processing of large datasets. This work presents an OpenCL-based framework for accelerating the kNN algorithm on field-programmable gate arrays (FPGAs) that benefits from random projection dimensionality reduction. The proposed RPkNN framework includes two compute modules, which implement a throughput-optimized hardware architecture based on random projection and the kNN algorithm, and a host program that facilitates easy integration of the compute modules into existing applications. RPkNN also uses a new buffering scheme tailored to random projection and the kNN algorithm. The proposed architecture enables parallel kNN computations with a single memory channel and takes advantage of the sparsity of the input data to implement a highly optimized, parallel random projection. We employ a computational storage device (CSD) to directly access the high-dimensional data on a non-volatile memory express (NVMe) solid-state drive (SSD) and to store and reuse the compressed, low-dimensional data in the FPGA dynamic random access memory (DRAM), hence eliminating data transfers to the host DRAM. We compare RPkNN implemented on the Samsung SmartSSD CSD with the kNN implementation of the scikit-learn library running on an Intel Xeon Gold 6154 CPU. The experimental results show that the proposed RPkNN solution achieves, on average, $26\times$ and $46\times$ higher performance across different dimensions per single kNN computation for the SIFT1M and GIST1M databases, respectively. Finally, RPkNN is $1.7\times$ faster than the similar FPGA-based reference method.
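The two stages named in the abstract, random projection followed by kNN search on the reduced data, can be illustrated with a small software sketch. This is not the authors' OpenCL kernel: the record does not say which projection matrix RPkNN uses, so the sparse ±1 (Achlioptas-style) matrix, the parameter `s`, and the function names `make_projection`, `project`, and `knn` are all assumptions made for illustration.

```python
import heapq
import random


def make_projection(d_in, d_out, seed=0, s=3):
    """Build a sparse random matrix (assumed Achlioptas-style, not from the paper).

    Each entry is +sqrt(s/d_out) or -sqrt(s/d_out) with probability 1/(2s)
    each, and zero otherwise. Only non-zero entries are stored.
    """
    rng = random.Random(seed)
    scale = (s / d_out) ** 0.5
    cols = []
    for _ in range(d_out):
        col = []  # (input_index, weight) pairs for one output dimension
        for i in range(d_in):
            u = rng.random()
            if u < 1.0 / (2 * s):
                col.append((i, scale))
            elif u < 1.0 / s:
                col.append((i, -scale))
        cols.append(col)
    return cols


def project(vec, cols):
    """Map a d_in-dimensional vector to d_out dimensions."""
    return [sum(w * vec[i] for i, w in col) for col in cols]


def knn(query, points, k):
    """Brute-force k nearest neighbours by squared Euclidean distance."""
    dists = (
        (sum((a - b) ** 2 for a, b in zip(query, p)), idx)
        for idx, p in enumerate(points)
    )
    return [idx for _, idx in heapq.nsmallest(k, dists)]
```

In use, the database would be projected once and the low-dimensional copy kept for every later query, which loosely mirrors the abstract's point about storing and reusing the compressed data: `low = [project(p, cols) for p in database]`, then `knn(project(q, cols), low, k)` for each query `q`.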
Author	Yao, Xuebin; Bank Tavakoli, Erfan; Beygi, Amir
Author_xml	– sequence: 1 givenname: Erfan orcidid: 0000-0002-3248-9301 surname: Bank Tavakoli fullname: Bank Tavakoli, Erfan email: ebanktav@asu.edu organization: School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, USA – sequence: 2 givenname: Amir surname: Beygi fullname: Beygi, Amir email: a.beygi@samsung.com organization: Memory Solutions Lab, Samsung Semiconductor, Inc, San Jose, CA, USA – sequence: 3 givenname: Xuebin surname: Yao fullname: Yao, Xuebin email: xuebin.yao@samsung.com organization: Memory Solutions Lab, Samsung Semiconductor, Inc, San Jose, CA, USA
CODEN	ITCOB4
CitedBy_id	crossref_primary_10_3390_s23125710 crossref_primary_10_1016_j_eswa_2024_123570 crossref_primary_10_1038_s41598_024_80210_x crossref_primary_10_1145_3616873
Cites_doi	10.1371/journal.pgen.1004573 10.1109/FPT.2016.7929193 10.1109/ICCD46524.2019.00030 10.1109/ICFPT51103.2020.00027 10.1090/conm/026/737400 10.1007/978-3-540-39964-3_62 10.1007/s10115-007-0114-2 10.1007/s10994-006-6265-7 10.1109/CVPR.2018.00517 10.1109/ICFPT51103.2020.00026 10.1002/9780470713785 10.1007/978-1-4419-9660-2 10.1145/1150402.1150436
ContentType	Journal Article
Copyright	Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
Copyright_xml	– notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2022
DBID	97E RIA RIE AAYXX CITATION 7SP 8FD L7M
DOI	10.1109/TVLSI.2022.3147743
DatabaseName	IEEE Xplore (IEEE) IEEE All-Society Periodicals Package (ASPP) 1998–Present IEEE Electronic Library (IEL) CrossRef Electronics & Communications Abstracts Technology Research Database Advanced Technologies Database with Aerospace
DatabaseTitle	CrossRef Technology Research Database Advanced Technologies Database with Aerospace Electronics & Communications Abstracts
DatabaseTitleList	Technology Research Database
Database_xml	– sequence: 1 dbid: RIE name: IEEE Electronic Library (IEL) url: https://ieeexplore.ieee.org/ sourceTypes: Publisher
DeliveryMethod	fulltext_linktorsrc
Discipline	Engineering
EISSN	1557-9999
EndPage	552
ExternalDocumentID	10_1109_TVLSI_2022_3147743 9714070
Genre	orig-research
IEDL.DBID	RIE
ISICitedReferencesCount	10
ISICitedReferencesURI	http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000758189100001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
ISSN	1063-8210
IngestDate	Mon Jun 30 17:03:03 EDT 2025 Sat Nov 29 03:36:19 EST 2025 Tue Nov 18 21:32:40 EST 2025 Wed Aug 27 02:40:31 EDT 2025
IsPeerReviewed	true
IsScholarly	true
Issue	4
Language	English
License	https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html https://doi.org/10.15223/policy-029 https://doi.org/10.15223/policy-037
LinkModel	DirectLink
Notes	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ORCID	0000-0002-3248-9301
PQID	2641995220
PQPubID	85424
PageCount	4
ParticipantIDs	crossref_primary_10_1109_TVLSI_2022_3147743 proquest_journals_2641995220 ieee_primary_9714070 crossref_citationtrail_10_1109_TVLSI_2022_3147743
PublicationCentury	2000
PublicationDate	2022-04-01
PublicationDateYYYYMMDD	2022-04-01
PublicationDate_xml	– month: 04 year: 2022 text: 2022-04-01 day: 01
PublicationDecade	2020
PublicationPlace	New York
PublicationPlace_xml	– name: New York
PublicationTitle	IEEE transactions on very large scale integration (VLSI) systems
PublicationTitleAbbrev	TVLSI
PublicationYear	2022
Publisher	IEEE The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml	– name: IEEE – name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
References	ref13 ref12 ref14 ref11 ref10 ref2 ref1 (ref15) 2021 ref8 ref7 ref4 ref3 ref6 ref5 Pedregosa (ref9) 2012; 12
References_xml	– ident: ref2 doi: 10.1371/journal.pgen.1004573 – volume: 12 start-page: 2825 year: 2012 ident: ref9 article-title: Scikit-learn: Machine learning in Python publication-title: J. Mach. Learn. Res. – ident: ref14 doi: 10.1109/FPT.2016.7929193 – ident: ref8 doi: 10.1109/ICCD46524.2019.00030 – ident: ref13 doi: 10.1109/ICFPT51103.2020.00027 – ident: ref10 doi: 10.1090/conm/026/737400 – ident: ref1 doi: 10.1007/978-3-540-39964-3_62 – ident: ref3 doi: 10.1007/s10115-007-0114-2 – ident: ref11 doi: 10.1007/s10994-006-6265-7 – ident: ref7 doi: 10.1109/CVPR.2018.00517 – ident: ref6 doi: 10.1109/ICFPT51103.2020.00026 – volume-title: SmartSSD Computational Storage Drive: Installation and User Guide year: 2021 ident: ref15 – ident: ref4 doi: 10.1002/9780470713785 – ident: ref5 doi: 10.1007/978-1-4419-9660-2 – ident: ref12 doi: 10.1145/1150402.1150436
SSID	ssj0014490
Score	2.3950236
SourceID	proquest crossref ieee
SourceType	Aggregation Database Enrichment Source Index Database Publisher
StartPage	549
URI	https://ieeexplore.ieee.org/document/9714070 https://www.proquest.com/docview/2641995220
Volume	30
WOSCitedRecordID	wos000758189100001&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
journalDatabaseRights	– providerCode: PRVIEE databaseName: IEEE Electronic Library (IEL) customDbUrl: eissn: 1557-9999 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0014490 issn: 1063-8210 databaseCode: RIE dateStart: 19930101 isFulltext: true titleUrlDefault: https://ieeexplore.ieee.org/ providerName: IEEE