Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs

Bibliographic Details
Published in:	IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 42, no. 4, pp. 824-836
Main Authors:	Malkov, Yu A., Yashunin, D. A.
Format:	Journal Article
Language:	English
Published:	United States IEEE 01.04.2020 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Algorithms; approximate search; Approximation algorithms; artificial intelligence; big data; Biological system modeling; Brain modeling; Complexity theory; Data models; data structures; Graph and tree search strategies; Graphs; graphs and networks; information search and retrieval; information storage and retrieval; information technology and systems; Metric space; Multilayers; nearest neighbor search; Performance evaluation; Proximity; Routing; Search problems; search process; Searching; similarity search; Structural hierarchy
ISSN:	0162-8828, 1939-3539, 2160-9292

Description
Summary:	We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search structures (typically used at the coarse search stage of most proximity graph techniques). Hierarchical NSW incrementally builds a multi-layer structure consisting of a hierarchical set of proximity graphs (layers) for nested subsets of the stored elements. The maximum layer in which an element is present is selected randomly with an exponentially decaying probability distribution. This produces graphs similar to the previously studied Navigable Small World (NSW) structures while additionally separating the links by their characteristic distance scales. Starting the search from the upper layer and exploiting this scale separation boosts performance compared to NSW and allows logarithmic complexity scaling. Additionally employing a heuristic for selecting proximity graph neighbors significantly increases performance at high recall and in the case of highly clustered data. Performance evaluation has demonstrated that the proposed general metric space search index is able to strongly outperform previous open-source state-of-the-art vector-only approaches. The similarity of the algorithm to the skip list structure allows a straightforward balanced distributed implementation.
DOI:	10.1109/TPAMI.2018.2889473
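
Illustration:	The summary describes two mechanisms: a randomly chosen maximum layer per element, with exponentially decaying probability, and a search that starts at the upper layer and works downward. The Python sketch below is a minimal illustration of both, not the authors' implementation. Everything specific in it is an assumption made for the example: the function names, the layer formula floor(-ln(u) * m_l), the normalization value m_l = 1 / ln(16), and the simplification of the search to a single-candidate greedy walk. This record gives none of those details.

```python
import math
import random
from collections import Counter


def sample_max_layer(m_l: float, rng: random.Random) -> int:
    """Draw the top layer for a new element (assumed formula).

    Uses floor(-ln(u) * m_l) with u uniform on (0, 1], so each higher
    layer is exponentially less likely than the one below it.
    """
    u = 1.0 - rng.random()  # shift [0, 1) to (0, 1] to avoid log(0)
    return int(-math.log(u) * m_l)


def greedy_descent(layers, dist, query, entry):
    """Greedy walk from the top layer down to layer 0.

    layers: list of adjacency dicts {node: [neighbors]}, index 0 = bottom.
    dist:   function (node, query) -> distance.
    entry:  a node present in the top layer.
    Simplification: keeps a single best candidate on every layer.
    """
    current = entry
    for graph in reversed(layers):
        improved = True
        while improved:
            improved = False
            for neighbor in graph.get(current, ()):
                if dist(neighbor, query) < dist(current, query):
                    current = neighbor
                    improved = True
    return current


if __name__ == "__main__":
    rng = random.Random(0)
    m_l = 1.0 / math.log(16)  # hypothetical normalization factor
    counts = Counter(sample_max_layer(m_l, rng) for _ in range(10_000))
    print("elements per top layer:", dict(sorted(counts.items())))

    # Toy 1-D data set: node id -> coordinate.
    points = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0}
    layers = [
        {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]},  # layer 0: all nodes
        {0: [4], 4: [0]},                                   # layer 1: long links
    ]
    nearest = greedy_descent(
        layers, lambda n, q: abs(points[n] - q), query=2.9, entry=0
    )
    print("nearest node:", nearest)
```

In the toy data the upper layer holds only a long link between nodes 0 and 4, so the walk first jumps across the data set and then refines on the dense bottom layer. This is the scale separation the summary refers to.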