A Scalable Similarity Join Algorithm Based on MapReduce and LSH
| Published in: | International journal of parallel programming Vol. 50; no. 3-4; pp. 360 - 380 |
|---|---|
| Main Authors: | Rivault, Sébastien; Bamha, Mostafa; Limet, Sébastien; Robert, Sophie |
| Format: | Journal Article |
| Language: | English |
| Published: | New York; Springer US; 01.08.2022; Springer Nature B.V; Springer Verlag |
| ISSN: | 0885-7458, 1573-7640 |
| Abstract | Similarity joins are recognized to be among the most useful data processing and analysis operations. A similarity join is used to retrieve all data pairs whose distances are smaller than a predefined threshold λ. In this paper, we introduce the MRS-join algorithm to perform similarity joins on large trajectory datasets. The MapReduce model and a randomized local sensitive hashing keys redistribution approach are used to balance load among processing nodes while reducing communications and computations to almost all relevant data by using distributed histograms. A cost analysis of the MRS-join algorithm shows that our approach is insensitive to data skew and guarantees perfect balancing properties, in large scale systems, during all stages of similarity join computations. These performances have been confirmed by a series of experiments using the Fréchet distance on large datasets of trajectories from real world and synthetic data benchmarks. |
|---|---|
| Author | Robert, Sophie; Rivault, Sébastien; Bamha, Mostafa; Limet, Sébastien |
| Author_xml | 1. Rivault, Sébastien (Université Orléans, INSA Centre Val de Loire, LIFO, EA); 2. Bamha, Mostafa (Université Orléans, INSA Centre Val de Loire, LIFO, EA; email: Mostafa.Bamha@univ-orleans.fr); 3. Limet, Sébastien (Université Orléans, INSA Centre Val de Loire, LIFO, EA); 4. Robert, Sophie (Université Orléans, INSA Centre Val de Loire, LIFO, EA) |
| BackLink | https://hal.science/hal-03677361 |
| Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022. Distributed under a Creative Commons Attribution 4.0 International License |
| DOI | 10.1007/s10766-022-00733-6 |
| Discipline | Computer Science |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | Data skew; Similarity join operations; Local sensitive hashing (LSH); Hadoop framework; MapReduce model |
| License | Distributed under a Creative Commons Attribution 4.0 International License: http://creativecommons.org/licenses/by/4.0 |
| SubjectTerms | Algorithms; Cognitive science; Computer Science; Cost analysis; Data processing; Datasets; Histograms; Processor Architectures; Similarity; Software Engineering/Programming and Operating Systems; Special Issue on High-Level Parallel Programming and Applications 2021; Theory of Computation; Time series |