Distributed Sparse Random Projection Trees for Constructing K-Nearest Neighbor Graphs

A random projection tree that partitions data points by projecting them onto random vectors is widely used for approximate nearest neighbor search in high-dimensional space. We consider a particular case of random projection trees for constructing a k-nearest neighbor graph (KNNG) from high-dimensio...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:Proceedings - IEEE International Parallel and Distributed Processing Symposium s. 36 - 46
Hlavní autori: Ranawaka, Isuru, Rahman, Md Khaledur, Azad, Ariful
Médium: Konferenčný príspevok..
Jazyk:English
Vydavateľské údaje: IEEE 01.05.2023
Predmet:
ISSN:1530-2075
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:A random projection tree that partitions data points by projecting them onto random vectors is widely used for approximate nearest neighbor search in high-dimensional space. We consider a particular case of random projection trees for constructing a k-nearest neighbor graph (KNNG) from high-dimensional data. We develop a distributed-memory Random Projection Tree (DRPT) algorithm for constructing sparse random projection trees and then running a query on the forest to create the KNN graph. DRPT uses sparse matrix operations and a communication reduction scheme to scale KNN graph constructions to thousands of processes on a supercomputer. The accuracy of DRPT is comparable to state-of-the-art methods for approximate nearest neighbor search, while it runs two orders of magnitude faster than its peers. DRPT is available at https://github.com/HipGraph/DRPT.
ISSN:1530-2075
DOI:10.1109/IPDPS54959.2023.00014