NEO-DNND: Communication-Optimized Distributed Nearest Neighbor Graph Construction
Graph-based approximate nearest neighbor algorithms have shown high neighbor structure representation quality. NN-Descent is a widely known graph-based approximate nearest neighbor (ANN) algorithm. However, graph-based approaches are memory- and time-consuming.To address the drawbacks, we develop a...
Uložené v:
| Vydané v: | SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis s. 688 - 696 |
|---|---|
| Hlavní autori: | , , , , |
| Médium: | Konferenčný príspevok.. |
| Jazyk: | English |
| Vydavateľské údaje: |
IEEE
17.11.2024
|
| Predmet: | |
| On-line prístup: | Získať plný text |
| Tagy: |
Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
|
| Shrnutí: | Graph-based approximate nearest neighbor algorithms have shown high neighbor structure representation quality. NN-Descent is a widely known graph-based approximate nearest neighbor (ANN) algorithm. However, graph-based approaches are memory- and time-consuming.To address the drawbacks, we develop a scalable distributed NN-Descent. Our NEO-DNND (neighbor-checking efficiency optimized distributed NN-Descent) is built on top of MPI and designed to utilize network bandwidth efficiently. NEO-DNND reduces duplicate elements, increases intra-node data sharing, and leverages available DRAM to replicate data that may be sent frequently.NEO-DNND showed remarkable scalability up to 256 nodes and was able to construct neighborhood graphs from billion-scale datasets. Compared to a leading shared-memory ANN library, NEO-DNND achieved competitive performance even on a single node and exhibited 41.7X better performance by scaling up to 32 nodes. Furthermore, NEO-DNND outperformed a state-of-the-art distributed NN-Descent implementation, achieving up to a 6.0X speedup. |
|---|---|
| DOI: | 10.1109/SCW63240.2024.00096 |