Streaming Algorithms for Estimating High Set Similarities in LogLog Space

Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases and machine learning. MinHash is a well-known technique for approximating Jaccard similarity of sets and has been successfully used for many applications. Its two compressed versions, ...

Bibliographic Details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 10, pp. 3438-3452
Main Authors: Qi, Yiyan; Wang, Pinghui; Zhang, Yuanming; Zhai, Qiaozhu; Wang, Chenxu; Tian, Guangjian; Lui, John C.S.; Guan, Xiaohong
Format: Journal Article
Language: English
Published: New York: IEEE, 1 October 2021
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
ISSN: 1041-4347, 1558-2191
Online Access: Full text