Implementation and Optimization of Parallel KNN Algorithm for Sunway Architecture

Bibliographic Details
Published in: Ji suan ji gong cheng Vol. 49; no. 5; pp. 286-294
Main Authors: WANG Qihan, PANG Jianmin, YUE Feng, ZHU Di, SHEN Li, XIAO Qian
Format: Journal Article
Language: Chinese, English
Published: Editorial Office of Computer Engineering 01.05.2023
ISSN: 1000-3428
Description
Summary: The K-Nearest Neighbor (KNN) algorithm is one of the most commonly used classification algorithms in artificial intelligence, and improving its performance significantly benefits the sorting and analysis of massive data and big-data classification. The new generation of Sunway supercomputers is in the initial stage of application development. Exploiting the structural characteristics of the new-generation Sunway heterogeneous many-core processors makes an efficient KNN algorithm possible for the analysis and collation of massive data. In this study, based on the structural characteristics of the SW26010pro processor, the master-slave acceleration programming model is used to implement a basic version of the parallel KNN algorithm, which offloads the core computation to the slave cores for thread-level parallelism. Subsequently, the key factors affecting the performance of the basic parallel algorithm are analyzed, and the SWKNN algorithm is proposed, which uses a task-division method different from that of the basic parallel KNN algorithm. Finally, unnecessary communication overhead is reduced through data pipelining optimization, inter-core communication optimization, and secondary load-balancing optimization, which effectively relieves memory access pressure and further improves the performance of the algorithm. The experimental results show that, compared with the serial KNN algorithm, the basic parallel KNN algorithm for the Sunway architecture achieves a maximum speedup of 48 times on a single core group of the SW26010pro processor. At the same scale, the speedup of SWKNN is 399 times higher than that of the basic parallel KNN algorithm.
DOI: 10.19678/j.issn.1000-3428.0063954
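
Illustrative sketch: the summary describes a basic parallel KNN in which a master offloads the distance computation to worker cores. The Python below is a minimal, generic illustration of that idea under stated assumptions, not the authors' SWKNN algorithm and not Sunway code. The function names `local_top_k` and `knn_classify`, the chunked split of the training set, and the use of a thread pool in place of slave cores are all hypothetical choices made for this example; a non-empty training set is assumed.

```python
import heapq
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_top_k(query, points, labels, k):
    """Worker step: return up to k nearest (distance, label) pairs in one chunk."""
    pairs = ((math.dist(query, p), lab) for p, lab in zip(points, labels))
    return heapq.nsmallest(k, pairs)


def knn_classify(query, points, labels, k=3, workers=4):
    """Coordinator step: split the training set, merge partial results, vote."""
    # Hypothetical task division: contiguous, equally sized chunks per worker.
    chunk = math.ceil(len(points) / workers)
    starts = range(0, len(points), chunk)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(
            lambda s: local_top_k(query, points[s:s + chunk], labels[s:s + chunk], k),
            starts,
        )
        candidates = [pair for part in partials for pair in part]
    # The global k nearest neighbors must be among the per-chunk candidates.
    nearest = heapq.nsmallest(k, candidates)
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

For example, `knn_classify((0.2, 0.1), [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)], ["a", "a", "a", "b", "b", "b"], k=3, workers=2)` classifies the query by majority vote among its three nearest training points. The thread pool here only shows how the work is partitioned and merged; it is not expected to reproduce the speedups reported in the summary.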