Distributed Sparse Random Projection Trees for Constructing K-Nearest Neighbor Graphs

Bibliographic details
Published in: Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 36-46
Main authors: Ranawaka, Isuru; Rahman, Md Khaledur; Azad, Ariful
Format: Conference proceeding
Language: English
Published: IEEE, 1 May 2023
ISSN: 1530-2075
Online access: Full text
Abstract A random projection tree that partitions data points by projecting them onto random vectors is widely used for approximate nearest neighbor search in high-dimensional space. We consider a particular case of random projection trees for constructing a k-nearest neighbor graph (KNNG) from high-dimensional data. We develop a distributed-memory Random Projection Tree (DRPT) algorithm for constructing sparse random projection trees and then running a query on the forest to create the KNN graph. DRPT uses sparse matrix operations and a communication reduction scheme to scale KNN graph constructions to thousands of processes on a supercomputer. The accuracy of DRPT is comparable to state-of-the-art methods for approximate nearest neighbor search, while it runs two orders of magnitude faster than its peers. DRPT is available at https://github.com/HipGraph/DRPT.
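The core idea in the abstract (split points by projecting them onto random vectors, build several such trees, then take each point's neighbors from the points that share a leaf with it) can be illustrated with a minimal single-process sketch. This is not the DRPT implementation: it uses no distributed memory, no sparse matrix kernels, and no communication reduction. The function names, the median split rule, the ±1 sparse direction, and the parameters `n_trees`, `leaf_size`, and `density` are all assumptions chosen for illustration.

```python
import random
from math import dist


def sparse_direction(dim, density, rng):
    """Random vector with entries +1 or -1 with probability `density`, else 0 (assumed scheme)."""
    vec = [rng.choice((-1.0, 1.0)) if rng.random() < density else 0.0
           for _ in range(dim)]
    if not any(vec):                      # avoid an all-zero direction
        vec[rng.randrange(dim)] = 1.0
    return vec


def build_leaves(points, ids, leaf_size, density, rng):
    """Recursively split `ids` at the median projection and return the leaf buckets."""
    if len(ids) <= leaf_size:             # assumes leaf_size >= 1
        return [ids]
    direction = sparse_direction(len(points[0]), density, rng)
    ranked = sorted(ids, key=lambda i: sum(d * x for d, x in zip(direction, points[i])))
    mid = len(ranked) // 2
    return (build_leaves(points, ranked[:mid], leaf_size, density, rng)
            + build_leaves(points, ranked[mid:], leaf_size, density, rng))


def knn_graph(points, k, n_trees=8, leaf_size=16, density=0.3, seed=0):
    """Approximate KNN graph: candidates for a point are all points sharing a leaf with it."""
    rng = random.Random(seed)
    candidates = [set() for _ in points]
    for _ in range(n_trees):
        all_ids = list(range(len(points)))
        for leaf in build_leaves(points, all_ids, leaf_size, density, rng):
            for i in leaf:
                candidates[i].update(j for j in leaf if j != i)
    # Keep the k closest candidates per point, measured by Euclidean distance.
    return [sorted(c, key=lambda j: dist(points[i], points[j]))[:k]
            for i, c in enumerate(candidates)]
```

Calling `knn_graph(points, k=10)` on a list of equal-length coordinate lists returns, for each point, the indices of up to `k` approximate neighbors. In this sketch, more trees or larger leaves widen the candidate sets and should improve recall at the cost of more distance computations; setting `leaf_size` to the number of points reduces it to exact brute-force search.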
Author Azad, Ariful
Rahman, Md Khaledur
Ranawaka, Isuru
Author_xml – sequence: 1
  givenname: Isuru
  surname: Ranawaka
  fullname: Ranawaka, Isuru
  email: isjarana@iu.edu
  organization: Indiana University,Bloomington,IN,USA
– sequence: 2
  givenname: Md Khaledur
  surname: Rahman
  fullname: Rahman, Md Khaledur
  email: khaledrahman@meta.com
  organization: Meta Inc
– sequence: 3
  givenname: Ariful
  surname: Azad
  fullname: Azad, Ariful
  email: azad@iu.edu
  organization: Indiana University,Bloomington,IN,USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/IPDPS54959.2023.00014
Discipline Computer Science
EISBN 9798350337662
EISSN 1530-2075
EndPage 46
ExternalDocumentID 10177410
Genre orig-research
GrantInformation_xml – fundername: Advanced Scientific Computing Research
  funderid: 10.13039/100006192
– fundername: U.S. Department of Energy
  funderid: 10.13039/100000015
ISICitedReferencesCount 2
IngestDate Wed Aug 27 02:11:46 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
PageCount 11
PublicationCentury 2000
PublicationDate 2023-May
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-May
PublicationDecade 2020
PublicationTitle Proceedings - IEEE International Parallel and Distributed Processing Symposium
PublicationTitleAbbrev IPDPS
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 36
SubjectTerms Approximation algorithms
Costs
Data visualization
Distributed databases
distributed memory algorithms
k nearest neighbor graph
Nearest neighbor methods
parallel algorithm
Parallel processing
Scalability
Title Distributed Sparse Random Projection Trees for Constructing K-Nearest Neighbor Graphs
URI https://ieeexplore.ieee.org/document/10177410