Lower Bounds on Performance of Metric Tree Indexing Schemes for Exact Similarity Search in High Dimensions

Bibliographic details
Published in: Algorithmica Vol. 66; No. 2; pp. 310-328
Main author: Pestov, Vladimir
Format: Journal Article
Language: English
Published: New York, Springer-Verlag, 1 June 2013
Springer
ISSN:0178-4617, 1432-0541
Online access: Full text
Description
Summary: Within a mathematically rigorous model, we analyse the curse of dimensionality for deterministic exact similarity search in the context of popular indexing schemes: metric trees. The datasets $X$ are sampled randomly from a domain $\Omega$, equipped with a distance, $\rho$, and an underlying probability distribution, $\mu$. While performing an asymptotic analysis, we send the intrinsic dimension $d$ of $\Omega$ to infinity, and assume that the size of a dataset, $n$, grows superpolynomially yet subexponentially in $d$. Exact similarity search refers to finding the nearest neighbour in the dataset $X$ to a query point $\omega \in \Omega$, where the query points are subject to the same probability distribution $\mu$ as datapoints. Let $\mathcal{F}$ denote a class of all 1-Lipschitz functions on $\Omega$ that can be used as decision functions in constructing a hierarchical metric tree indexing scheme. Suppose the VC dimension of the class of all sets $\{\omega : f(\omega) \ge a\}$, $a \in \mathbb{R}$, is $o(n^{1/4}/\log^2 n)$. (In view of a 1995 result of Goldberg and Jerrum, even a stronger complexity assumption $d^{O(1)}$ is reasonable.) We deduce the $\Omega(n^{1/4})$ lower bound on the expected average case performance of hierarchical metric-tree based indexing schemes for exact similarity search in $(\Omega, X)$. In particular, this bound is superpolynomial in $d$.
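To make the objects in the summary concrete, the following is a minimal sketch of one kind of metric tree, assuming a vantage-point style construction in which each node's decision function is f(ω) = ρ(ω, p) for a pivot p. Such a function is 1-Lipschitz by the triangle inequality, which is what justifies the pruning step. The Euclidean distance, the median threshold, and the names `dist`, `Node`, `build` and `nearest` are illustrative assumptions and are not taken from the article; the counter `evals` simply records how many decision functions were evaluated during one query.

```python
import random


def dist(a, b):
    """Euclidean distance, standing in for the metric rho (an assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class Node:
    """One tree node: decision function f(w) = dist(pivot, w) and a threshold."""

    def __init__(self, pivot, threshold, inner, outer):
        self.pivot = pivot
        self.threshold = threshold
        self.inner = inner  # points x with f(x) < threshold
        self.outer = outer  # points x with f(x) >= threshold


def build(points):
    """Recursively split the points by the median value of f."""
    if not points:
        return None
    pivot, rest = points[0], points[1:]
    if not rest:
        return Node(pivot, 0.0, None, None)
    scored = [(dist(pivot, p), p) for p in rest]
    threshold = sorted(s for s, _ in scored)[len(scored) // 2]
    inner = [p for s, p in scored if s < threshold]
    outer = [p for s, p in scored if s >= threshold]
    return Node(pivot, threshold, build(inner), build(outer))


def nearest(node, q, best, stats):
    """Exact nearest-neighbour search; returns (distance, point)."""
    if node is None:
        return best
    d = dist(q, node.pivot)  # evaluate the decision function at the query
    stats["evals"] += 1
    if best is None or d < best[0]:
        best = (d, node.pivot)
    # Since f is 1-Lipschitz, rho(q, x) >= |f(q) - f(x)|.
    # Inner points are farther than d - threshold from q,
    # outer points are at least threshold - d away from q.
    sides = [(node.inner, d - node.threshold),
             (node.outer, node.threshold - d)]
    if d >= node.threshold:
        sides.reverse()  # descend first into the side containing q
    for child, lower_bound in sides:
        if lower_bound < best[0]:  # otherwise the child cannot improve best
            best = nearest(child, q, best, stats)
    return best


if __name__ == "__main__":
    random.seed(0)
    for dim in (2, 8, 32):
        data = [tuple(random.random() for _ in range(dim)) for _ in range(500)]
        tree = build(data)
        query = tuple(random.random() for _ in range(dim))
        stats = {"evals": 0}
        nearest(tree, query, None, stats)
        print(dim, stats["evals"])
```

Running the script prints, for a few dimensions, how many nodes a single query visits. It is a toy experiment only and does not reproduce the asymptotic regime or the lower bound described in the summary.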
DOI:10.1007/s00453-012-9638-2