A parameter-free nearest neighbor algorithm with reduced prediction time and improved performance through injected randomness

Bibliographic Details
Published in:	Neural computing & applications Vol. 37; no. 17; pp. 10531 - 10556
Main Authors:	Singh, Manpreet; Chhabra, Jitender Kumar
Format:	Journal Article
Language:	English
Published:	London: Springer London, 01.06.2025; Springer Nature B.V
Subjects:	Algorithms; Artificial Intelligence; Computational Biology/Bioinformatics; Computational Science and Engineering; Computer Science; Data Mining and Knowledge Discovery; Datasets; Image Processing and Computer Vision; Machine learning; Parameter sensitivity; Pattern classification; Probability and Statistics in Computer Science; Randomness; S.I.: Timely Advances of Deep Learning with applications and Data Driven Modeling; Special Issue on Timely Advances of Deep Learning with applications and Data Driven Modeling; Binary search tree; Nearest neighbors; K-means clustering; Injected randomness
ISSN:	0941-0643, 1433-3058

Description
Summary:	K-nearest neighbor (KNN) is regarded as one of the top machine learning algorithms because of its effectiveness in pattern classification and its simple implementation. However, its use is limited by a longer prediction time than model-based machine learning algorithms, by its sensitivity to outliers in the training dataset, and by the need to tune the neighborhood size (k). This research article therefore proposes a new KNN variant that reduces training and prediction time while improving performance. Prediction time is reduced by building a binary search tree (BST) with a divide-and-conquer strategy, and prediction performance is improved through ensembling with injected randomness such as bootstrap aggregation, random subspace, and random node splitting. The proposed variant is parameter-free and hence not sensitive to the neighborhood-size hyperparameter. Three experiments on 26 selected datasets compare the prediction time and predictive power of the proposed KNN against random forest and six selected KNN variants. The results show that the proposed variant gives better prediction results with reduced prediction and training time.
DOI:	10.1007/s00521-024-10565-9
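
Illustrative sketch:	The summary names three sources of injected randomness (bootstrap aggregation, random subspace, random node splitting) on top of a binary tree built by divide and conquer. The Python below is a minimal sketch of how those pieces can fit together, not the authors' algorithm: the record does not give their splitting rule, stopping rule, or voting scheme. The function names (`build_tree`, `query_tree`, `fit_ensemble`, `predict`), the stop-when-pure rule, the square-root subspace size, and the `n_trees` and `seed` settings are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def build_tree(rows, features, rng):
    """Divide and conquer: split rows until a node is pure or cannot be split.

    rows is a list of (point, label) pairs; features is the random subspace
    (column indices) this tree may split on. Stopping at pure nodes is an
    assumption of this sketch, not a rule taken from the article.
    """
    labels = {label for _, label in rows}
    # Features that still take at least two distinct values in this node.
    splittable = [f for f in features if len({p[f] for p, _ in rows}) > 1]
    if len(labels) == 1 or not splittable:
        return {"rows": rows}  # leaf: keep the points for a local 1-NN search
    # Random node splitting: random feature, random observed value as threshold.
    f = rng.choice(splittable)
    values = sorted({p[f] for p, _ in rows})
    t = values[rng.randrange(len(values) - 1)]  # never the maximum, so both sides are non-empty
    left = [r for r in rows if r[0][f] <= t]
    right = [r for r in rows if r[0][f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(left, features, rng),
        "right": build_tree(right, features, rng),
    }


def query_tree(node, x):
    """Descend to one leaf, then return the label of the nearest stored point."""
    while "rows" not in node:
        go_left = x[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    # The nearest-neighbor search only scans the points stored in this leaf.
    nearest = min(node["rows"], key=lambda r: math.dist(r[0], x))
    return nearest[1]


def fit_ensemble(X, y, n_trees=25, seed=0):
    """Build n_trees randomized trees (n_trees and seed are sketch settings)."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    m = max(1, round(math.sqrt(d)))  # subspace size: an assumption
    trees = []
    for _ in range(n_trees):
        sample = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        rows = [(X[i], y[i]) for i in sample]
        features = rng.sample(range(d), m)  # random subspace
        trees.append(build_tree(rows, features, rng))
    return trees


def predict(trees, x):
    """Majority vote over the per-tree nearest-neighbor labels."""
    votes = Counter(query_tree(tree, x) for tree in trees)
    return votes.most_common(1)[0][0]
```

In this sketch a query descends one root-to-leaf path per tree and compares distances only against the points in that leaf, which is where the prediction-time saving over a full scan of the training set would come from. No neighborhood size k appears; the ensemble size is a setting of the sketch and says nothing about how the article achieves its parameter-free claim.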