RPkNN: An OpenCL-Based FPGA Implementation of the Dimensionality-Reduced kNN Algorithm Using Random Projection
| Published in: | IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 30, No. 4, pp. 549–552 |
|---|---|
| Main Authors: | , , |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: IEEE, 01.04.2022 (The Institute of Electrical and Electronics Engineers, Inc.) |
| Subjects: | |
| ISSN: | 1063-8210, 1557-9999 |
| Summary: | Due to the so-called curse of dimensionality and the growth of database sizes, the k-nearest neighbors (kNN) algorithm places an ever-increasing demand on computing resources and memory bandwidth, slowing the processing of large datasets. This work presents an OpenCL-based framework for accelerating the kNN algorithm on field-programmable gate arrays (FPGAs) that benefits from random projection for dimensionality reduction. The proposed RPkNN framework includes two compute modules, which implement a throughput-optimized hardware architecture for random projection and the kNN algorithm, and a host program that facilitates easy integration of the compute modules into existing applications. RPkNN also employs a new buffering scheme tailored to random projection and the kNN algorithm. The proposed architecture enables parallel kNN computations with a single memory channel and exploits the sparsity of the input data to realize a highly optimized, parallel implementation of random projection. A computational storage device (CSD) is used to directly access the high-dimensional data on a non-volatile memory express (NVMe) solid-state drive (SSD) and to store and reuse the compressed, low-dimensional data in the FPGA's dynamic random access memory (DRAM), thereby eliminating data transfers to the host DRAM. We compare RPkNN, implemented on the Samsung SmartSSD CSD, with the kNN implementation of the scikit-learn library running on an Intel Xeon Gold 6154 CPU. The experimental results show that the proposed RPkNN solution achieves, on average, 26× and 46× higher performance per single kNN computation across different dimensionalities for the SIFT1M and GIST1M databases, respectively. Finally, RPkNN is 1.7× faster than a similar FPGA-based reference method. |
|---|---|
| DOI: | 10.1109/TVLSI.2022.3147743 |
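The pipeline summarized in the abstract — reduce the data's dimensionality with a random projection, then run kNN in the reduced space — can be sketched in Python with scikit-learn, the baseline library the paper compares against. This is only an illustrative software analogue of the idea, not the paper's FPGA implementation; the dataset sizes, the 128-dimensional (SIFT-like) input, and the 32-dimensional target are arbitrary choices for the example:

```python
import numpy as np
from sklearn.random_projection import SparseRandomProjection
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in for a high-dimensional database (e.g., 128-dim SIFT descriptors).
rng = np.random.default_rng(0)
X = rng.standard_normal((2000, 128)).astype(np.float32)
queries = rng.standard_normal((5, 128)).astype(np.float32)

# Random projection to a lower dimension. A *sparse* random matrix is used here
# because sparsity is also what RPkNN exploits for its parallel hardware design.
proj = SparseRandomProjection(n_components=32, random_state=0)
X_low = proj.fit_transform(X)        # project (and cache) the database once
q_low = proj.transform(queries)      # project each incoming query

# kNN search runs entirely in the 32-dim space, cutting compute and bandwidth.
knn = NearestNeighbors(n_neighbors=10).fit(X_low)
dist, idx = knn.kneighbors(q_low)    # distances and indices of the 10 nearest points
```

The design point this mirrors is that the projected database (`X_low`) is computed once and reused for every query, which is why RPkNN keeps the compressed, low-dimensional data resident in FPGA DRAM instead of re-reading the high-dimensional data from the SSD.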