Streaming Algorithms for Estimating High Set Similarities in LogLog Space

Bibliographic details
Published in: IEEE Transactions on Knowledge and Data Engineering, Vol. 33, No. 10, pp. 3438-3452
Main authors: Qi, Yiyan; Wang, Pinghui; Zhang, Yuanming; Zhai, Qiaozhu; Wang, Chenxu; Tian, Guangjian; Lui, John C.S.; Guan, Xiaohong
Format: Journal Article
Language: English
Publication details: New York, IEEE, 2021-10-01
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subject:
ISSN: 1041-4347, 1558-2191
Abstract Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases and machine learning. MinHash is a well-known technique for approximating the Jaccard similarity of sets and has been used successfully in many applications. Its two compressed versions, $b$-bit MinHash and Odd Sketch, can significantly reduce the memory usage of MinHash, especially for estimating high similarities (i.e., similarities around 1). Although MinHash can be applied to static sets as well as to streaming sets, whose elements arrive in a streaming fashion, $b$-bit MinHash and Odd Sketch cannot handle streaming data. To solve this problem, we previously designed a memory-efficient sketch method, MaxLogHash, to accurately estimate Jaccard similarities in streaming sets. Compared with MinHash, our method uses smaller registers (each register consists of fewer than 7 bits) to build a compact sketch for each set. In this paper, we further develop a faster method, MaxLogOPH++. Compared with MaxLogHash, MaxLogOPH++ reduces the time complexity of updating the sketch for each arriving element from $O(k)$ to $O(1)$ at the cost of a small amount of additional memory. We conduct experiments on a variety of datasets, and the experimental results demonstrate the efficiency and effectiveness of our methods.
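For orientation, the baseline that the abstract compares against can be sketched as follows. This is a minimal illustration of plain streaming MinHash only, not the authors' MaxLogHash or MaxLogOPH++; the class name `MinHashSketch`, the use of seeded BLAKE2b as the hash family, and the 64-bit register width are all assumptions made for this example. It shows why a MinHash update costs $O(k)$ per arriving element: every one of the $k$ registers has to be compared against a fresh hash value.

```python
import hashlib


class MinHashSketch:
    """Plain MinHash over a stream (illustrative baseline, hypothetical names)."""

    def __init__(self, k, seed=0):
        self.k = k
        self.seed = seed
        # Each register holds the smallest 64-bit hash seen so far.
        # 2**64 is larger than any 64-bit hash, so it marks "empty".
        self.registers = [2**64] * k

    def _hash(self, item, i):
        # Assumption: seeded BLAKE2b stands in for the i-th random hash function.
        data = f"{self.seed}:{i}:{item}".encode("utf-8")
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "little")

    def update(self, item):
        # O(k) per element: all k registers are visited for every arrival.
        for i in range(self.k):
            h = self._hash(item, i)
            if h < self.registers[i]:
                self.registers[i] = h

    def jaccard(self, other):
        # The fraction of matching registers estimates the Jaccard similarity.
        # Note: two empty sketches match in every register and report 1.0.
        if self.k != other.k or self.seed != other.seed:
            raise ValueError("sketches must share k and seed")
        matches = sum(a == b for a, b in zip(self.registers, other.registers))
        return matches / self.k


if __name__ == "__main__":
    a, b = MinHashSketch(k=256), MinHashSketch(k=256)
    for x in range(1000):
        a.update(x)
    for x in range(500, 1500):
        b.update(x)
    # The true Jaccard similarity of these two sets is 500 / 1500.
    print("estimate:", a.jaccard(b))
```

Each register here is a full 64-bit value, which is the memory cost that the compressed sketches named in the abstract aim to reduce.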
Author Guan, Xiaohong
Tian, Guangjian
Wang, Chenxu
Lui, John C.S.
Zhai, Qiaozhu
Zhang, Yuanming
Qi, Yiyan
Wang, Pinghui
Author_xml – sequence: 1
  givenname: Yiyan
  orcidid: 0000-0002-8078-5834
  surname: Qi
  fullname: Qi, Yiyan
  email: qiyiyan@stu.xjtu.edu.cn
  organization: MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
– sequence: 2
  givenname: Pinghui
  orcidid: 0000-0001-5779-6108
  surname: Wang
  fullname: Wang, Pinghui
  email: phwang@mail.xjtu.edu.cn
  organization: MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
– sequence: 3
  givenname: Yuanming
  surname: Zhang
  fullname: Zhang, Yuanming
  email: zhangyuanming@stu.xjtu.edu.cn
  organization: MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
– sequence: 4
  givenname: Qiaozhu
  orcidid: 0000-0002-7312-4923
  surname: Zhai
  fullname: Zhai, Qiaozhu
  email: qzzhai@mail.xjtu.edu.cn
  organization: MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
– sequence: 5
  givenname: Chenxu
  orcidid: 0000-0002-9539-5046
  surname: Wang
  fullname: Wang, Chenxu
  email: cxwang@mail.xjtu.edu.cn
  organization: MOE Key Laboratory for Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an, Shaanxi, China
– sequence: 6
  givenname: Guangjian
  surname: Tian
  fullname: Tian, Guangjian
  email: Tian.Guangjian@huawei.com
  organization: Shenzhen Research Institute, Xi'an Jiaotong University, Shenzhen, China
– sequence: 7
  givenname: John C.S.
  orcidid: 0000-0001-7466-0384
  surname: Lui
  fullname: Lui, John C.S.
  email: cslui@cse.cuhk.edu.hk
  organization: Tsinghua National Lab for Information Science and Technology, Center for Intelligent and Networked Systems, Tsinghua University, Beijing, China
– sequence: 8
  givenname: Xiaohong
  surname: Guan
  fullname: Guan, Xiaohong
  email: xhguan@mail.xjtu.edu.cn
  organization: Shenzhen Research Institute, Xi'an Jiaotong University, Shenzhen, China
CODEN ITKEEH
CitedBy_id crossref_primary_10_1007_s12065_025_01079_x
crossref_primary_10_1109_TETC_2022_3221872
crossref_primary_10_1145_3639281
crossref_primary_10_1109_TKDE_2024_3523034
crossref_primary_10_1109_TNSE_2023_3275809
crossref_primary_10_1109_TKDE_2023_3342747
ContentType Journal Article
Copyright Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
Copyright_xml – notice: Copyright The Institute of Electrical and Electronics Engineers, Inc. (IEEE) 2021
DOI 10.1109/TKDE.2020.2969423
Discipline Engineering
Computer Science
EISSN 1558-2191
EndPage 3452
ExternalDocumentID 10_1109_TKDE_2020_2969423
8968366
Genre orig-research
GrantInformation_xml – fundername: GRF R4032-18
– fundername: MoE-CMCC
  grantid: MCM20190701
– fundername: National Natural Science Foundation of China
  grantid: 61922067; U1736205; 61902305
  funderid: 10.13039/501100001809
– fundername: Natural Science Basic Research Plan in ZheJiang Province of China
  grantid: LGG18F020016
– fundername: National Key Research and Development Program of China
  grantid: 2018YFC0830500
  funderid: 10.13039/501100012166
– fundername: Shenzhen Basic Research
  grantid: JCYJ20170816100819428
– fundername: Natural Science Basic Research Plan in Shaanxi Province of China
  grantid: 2019JM-159
ISICitedReferencesCount 6
ISSN 1041-4347
IsPeerReviewed true
IsScholarly true
Issue 10
Language English
License https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
https://doi.org/10.15223/policy-029
https://doi.org/10.15223/policy-037
ORCID 0000-0002-8078-5834
0000-0001-7466-0384
0000-0002-9539-5046
0000-0002-7312-4923
0000-0001-5779-6108
PageCount 15
PublicationCentury 2000
PublicationDate 2021-10-01
PublicationDateYYYYMMDD 2021-10-01
PublicationDate_xml – month: 10
  year: 2021
  text: 2021-10-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle IEEE transactions on knowledge and data engineering
PublicationTitleAbbrev TKDE
PublicationYear 2021
Publisher IEEE
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Publisher_xml – name: IEEE
– name: The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
StartPage 3438
SubjectTerms Algorithms
Estimation
Estimation error
Jaccard similarity
Machine learning
Registers
Similarity
sketch
Streaming algorithms
Time complexity
Trajectory
Title Streaming Algorithms for Estimating High Set Similarities in LogLog Space
URI https://ieeexplore.ieee.org/document/8968366
https://www.proquest.com/docview/2571222070
Volume 33