Distributed Sparse Random Projection Trees for Constructing K-Nearest Neighbor Graphs

Bibliographic details
Published in: Proceedings - IEEE International Parallel and Distributed Processing Symposium, pp. 36-46
Main authors: Ranawaka, Isuru; Rahman, Md Khaledur; Azad, Ariful
Format: Conference paper
Language: English
Published: IEEE, 1 May 2023
ISSN: 1530-2075
Abstract: A random projection tree, which partitions data points by projecting them onto random vectors, is widely used for approximate nearest neighbor search in high-dimensional space. We consider a particular case of random projection trees for constructing a k-nearest neighbor graph (KNNG) from high-dimensional data. We develop a distributed-memory Random Projection Tree (DRPT) algorithm that constructs sparse random projection trees and then runs a query on the resulting forest to create the KNN graph. DRPT uses sparse matrix operations and a communication reduction scheme to scale KNN graph construction to thousands of processes on a supercomputer. The accuracy of DRPT is comparable to state-of-the-art methods for approximate nearest neighbor search, while it runs two orders of magnitude faster than its peers. DRPT is available at https://github.com/HipGraph/DRPT.
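The idea summarized in the abstract can be illustrated with a minimal single-process sketch. This is not the DRPT implementation and does not reflect its distributed design or communication scheme; it only shows how a forest of sparse random projection trees can yield candidate neighbors for an approximate KNN graph. All function names and parameters here (`sparse_direction`, `build_leaves`, `knn_graph`, `density`, `leaf_size`, `n_trees`) are hypothetical, and the median split and the +1/-1 sparse vectors are assumptions chosen for simplicity.

```python
import random
from collections import defaultdict


def sparse_direction(dim, density, rng):
    """Sparse random vector stored as {index: sign}; about density*dim nonzeros."""
    vec = {j: rng.choice((-1.0, 1.0)) for j in range(dim) if rng.random() < density}
    if not vec:  # keep at least one nonzero so the projection is informative
        vec[rng.randrange(dim)] = rng.choice((-1.0, 1.0))
    return vec


def build_leaves(points, ids, leaf_size, density, rng):
    """Recursively split ids at the median projection; return the list of leaves."""
    if len(ids) <= leaf_size:
        return [ids]
    direction = sparse_direction(len(points[0]), density, rng)

    def projection(i):
        # Only the nonzero coordinates of the direction contribute.
        return sum(sign * points[i][j] for j, sign in direction.items())

    ordered = sorted(ids, key=projection)
    mid = len(ordered) // 2
    return (build_leaves(points, ordered[:mid], leaf_size, density, rng)
            + build_leaves(points, ordered[mid:], leaf_size, density, rng))


def knn_graph(points, k, n_trees=8, leaf_size=16, density=0.3, seed=0):
    """Approximate KNN graph: candidates are points sharing a leaf in any tree."""
    rng = random.Random(seed)
    all_ids = list(range(len(points)))
    candidates = defaultdict(set)
    for _ in range(n_trees):
        for leaf in build_leaves(points, all_ids, leaf_size, density, rng):
            for i in leaf:
                candidates[i].update(leaf)

    graph = {}
    for i in all_ids:
        def sq_dist(j, i=i):
            return sum((a - b) ** 2 for a, b in zip(points[i], points[j]))

        others = candidates[i] - {i}
        # Exact distances are computed only among the candidates; ties break by id.
        graph[i] = sorted(others, key=lambda j: (sq_dist(j), j))[:k]
    return graph


# Example: 20 points on a line, 2 neighbors each.
points = [[float(i), 0.0] for i in range(20)]
graph = knn_graph(points, k=2, leaf_size=4)
```

In this sketch, more trees or larger leaves widen the candidate sets and tend to improve recall at the cost of more distance computations; setting `leaf_size` to at least the number of points degenerates to an exact brute-force search.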
Author Azad, Ariful
Rahman, Md Khaledur
Ranawaka, Isuru
Author_xml – sequence: 1
  givenname: Isuru
  surname: Ranawaka
  fullname: Ranawaka, Isuru
  email: isjarana@iu.edu
  organization: Indiana University, Bloomington, IN, USA
– sequence: 2
  givenname: Md Khaledur
  surname: Rahman
  fullname: Rahman, Md Khaledur
  email: khaledrahman@meta.com
  organization: Meta Inc
– sequence: 3
  givenname: Ariful
  surname: Azad
  fullname: Azad, Ariful
  email: azad@iu.edu
  organization: Indiana University, Bloomington, IN, USA
CODEN IEEPAD
ContentType Conference Proceeding
DOI 10.1109/IPDPS54959.2023.00014
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library (IEL)
IEEE Proceedings Order Plans (POP All) 1998-Present
DatabaseTitleList
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library (IEL)
  url: https://ieeexplore.ieee.org/
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISBN 9798350337662
EISSN 1530-2075
EndPage 46
ExternalDocumentID 10177410
Genre orig-research
GrantInformation_xml – fundername: Advanced Scientific Computing Research
  funderid: 10.13039/100006192
– fundername: U.S. Department of Energy
  funderid: 10.13039/100000015
ISICitedReferencesCount 2
IngestDate Wed Aug 27 02:11:46 EDT 2025
IsPeerReviewed false
IsScholarly false
Language English
PageCount 11
PublicationCentury 2000
PublicationDate 2023-May
PublicationDateYYYYMMDD 2023-05-01
PublicationDate_xml – month: 05
  year: 2023
  text: 2023-May
PublicationDecade 2020
PublicationTitle Proceedings - IEEE International Parallel and Distributed Processing Symposium
PublicationTitleAbbrev IPDPS
PublicationYear 2023
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 36
SubjectTerms Approximation algorithms
Costs
Data visualization
Distributed databases
distributed memory algorithms
k nearest neighbor graph
Nearest neighbor methods
parallel algorithm
Parallel processing
Scalability
Title Distributed Sparse Random Projection Trees for Constructing K-Nearest Neighbor Graphs
URI https://ieeexplore.ieee.org/document/10177410