Approximate Vector Set Search: A Bio-Inspired Approach for High-Dimensional Spaces

Vector set search, an underexplored similarity search paradigm, aims to find vector sets similar to a query set. This search paradigm leverages the inherent structural alignment between sets and real-world entities to model more fine-grained and consistent relationships for diverse applications. Thi...

Full description

Saved in:
Bibliographic Details
Published in:Data engineering pp. 891 - 903
Main Authors: Li, Yiqi, Wang, Sheng, Chen, Zhiyu, Chen, Shangfeng, Peng, Zhiyong
Format: Conference Proceeding
Language:English
Published: IEEE 19.05.2025
Subjects:
ISSN:2375-026X
Online Access:Get full text
Tags: Add Tag
No Tags, Be the first to tag this record!
Description
Summary:Vector set search, an underexplored similarity search paradigm, aims to find vector sets similar to a query set. This search paradigm leverages the inherent structural alignment between sets and real-world entities to model more fine-grained and consistent relationships for diverse applications. This task, however, faces more severe efficiency challenges than traditional single-vector search due to the combinatorial explosion of pairings in set-to-set comparisons. In this work, we aim to address the efficiency challenges posed by the combinatorial explosion in vector set search, as well as the curse of dimensionality inherited from single-vector search. To tackle these challenges, we present an efficient algorithm for vector set search, BioVSS (Bio-inspired Vector Set Search). BioVSS simulates the fly olfactory circuit to quantize vectors into sparse binary codes and then designs an index based on the set membership property of the Bloom filter. The quantization and indexing strategy enables BioVSS to efficiently perform vector set search by pruning the search space. Experimental results demonstrate over 50 times speedup compared to linear scanning on million-scale datasets while maintaining a high recall rate of up to 98.9%, making it an efficient solution for vector set search.
ISSN:2375-026X
DOI:10.1109/ICDE65448.2025.00072