Lower Bounds on Performance of Metric Tree Indexing Schemes for Exact Similarity Search in High Dimensions
| Published in: | Algorithmica, Vol. 66, No. 2, pp. 310-328 |
|---|---|
| First author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | New York: Springer-Verlag (Springer), 1 June 2013 |
| Subjects: | |
| ISSN: | 0178-4617, 1432-0541 |
| Online access: | Full text |
| Abstract: | Within a mathematically rigorous model, we analyse the curse of dimensionality for deterministic exact similarity search in the context of popular indexing schemes: metric trees. The datasets $X$ are sampled randomly from a domain $\Omega$, equipped with a distance, $\rho$, and an underlying probability distribution, $\mu$. While performing an asymptotic analysis, we send the intrinsic dimension $d$ of $\Omega$ to infinity, and assume that the size of a dataset, $n$, grows superpolynomially yet subexponentially in $d$. Exact similarity search refers to finding the nearest neighbour in the dataset $X$ to a query point $\omega \in \Omega$, where the query points are subject to the same probability distribution $\mu$ as datapoints. Let $\mathcal{F}$ denote a class of all 1-Lipschitz functions on $\Omega$ that can be used as decision functions in constructing a hierarchical metric tree indexing scheme. Suppose the VC dimension of the class of all sets $\{\omega : f(\omega) \geq a\}$, $a \in \mathbb{R}$, is $o(n^{1/4}/\log^2 n)$. (In view of a 1995 result of Goldberg and Jerrum, even a stronger complexity assumption $d^{O(1)}$ is reasonable.) We deduce the $\Omega(n^{1/4})$ lower bound on the expected average case performance of hierarchical metric-tree based indexing schemes for exact similarity search in $(\Omega, X)$. In particular, this bound is superpolynomial in $d$. |
|---|---|
| DOI: | 10.1007/s00453-012-9638-2 |
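
The abstract refers to hierarchical metric trees built from 1-Lipschitz decision functions but does not name a specific tree. As a rough illustration only, the sketch below uses a vantage-point tree, one common instance in which each decision function is f(ω) = ρ(ω, p) for a pivot p; this f is 1-Lipschitz by the triangle inequality, which is what justifies the pruning step. The names `Node`, `build`, `nearest`, and the `stats` counter are hypothetical and not taken from the paper, and the Euclidean distance in the usage note is an assumption.

```python
import math
import random


class Node:
    """One node of a vantage-point tree (illustrative helper, not from the paper)."""

    def __init__(self, pivot, threshold, inside, outside):
        self.pivot = pivot          # vantage point stored at this node
        self.threshold = threshold  # median distance from the pivot to the rest
        self.inside = inside        # subtree with rho(x, pivot) < threshold
        self.outside = outside      # subtree with rho(x, pivot) >= threshold


def build(points, rho):
    """Recursively split points with the decision function f(x) = rho(x, pivot)."""
    if not points:
        return None
    pivot, rest = points[0], points[1:]
    if not rest:
        return Node(pivot, 0.0, None, None)
    dists = [rho(x, pivot) for x in rest]
    threshold = sorted(dists)[len(dists) // 2]
    inside = [x for x, d in zip(rest, dists) if d < threshold]
    outside = [x for x, d in zip(rest, dists) if d >= threshold]
    return Node(pivot, threshold, build(inside, rho), build(outside, rho))


def nearest(node, query, rho, best=None, stats=None):
    """Exact nearest-neighbour search; returns (distance, point)."""
    if node is None:
        return best
    d = rho(query, node.pivot)
    if stats is not None:
        stats["distance_computations"] += 1
    if best is None or d < best[0]:
        best = (d, node.pivot)
    # Descend first into the side of the split that contains the query.
    if d < node.threshold:
        first, second = node.inside, node.outside
    else:
        first, second = node.outside, node.inside
    best = nearest(first, query, rho, best, stats)
    # 1-Lipschitz pruning: every point x on the far side satisfies
    # rho(x, query) >= |f(x) - f(query)| >= |threshold - d|, so that side
    # can be skipped unless this bound is below the best distance so far.
    if abs(d - node.threshold) < best[0]:
        best = nearest(second, query, rho, best, stats)
    return best
```

Calling `build(points, math.dist)` on a list of coordinate tuples and then `nearest(tree, query, math.dist, stats={"distance_computations": 0})` returns the exact nearest neighbour and records how many distance evaluations the query needed. Tracking that count against the dataset size for growing dimension is one informal way to observe the behaviour the abstract analyses; it does not reproduce the paper's proof or its probabilistic model.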