Locality sensitive hashing for sampling-based algorithms in association rule mining
► A novel sampling approach with association rule mining can process very large databases in a reasonable time. ► Outliers are removed based on local properties of individual clusters. ► The proposed algorithms are shown to exhibit better accuracy or execution time than previously proposed algorithm...
| Published in: | Expert systems with applications Vol. 38; no. 10; pp. 12388 - 12397 |
|---|---|
| Main Authors: | Chen, Chyouhwa; Horng, Shi-Jinn; Huang, Chin-Pin |
| Format: | Journal Article |
| Language: | English |
| Published: | Elsevier Ltd, 15.09.2011 |
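| Subjects: | Algorithms; Association rule mining; Clustering; Clusters; Locality-sensitive hashing; Mining; Outlier removal; Samples; Sampling; Statistical analysis; Statistical methods |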
| ISSN: | 0957-4174, 1873-6793 |
| Abstract | ► A novel sampling approach with association rule mining can process very large databases in a reasonable time. ► Outliers are removed based on local properties of individual clusters. ► The proposed algorithms are shown to exhibit better accuracy or execution time than previously proposed algorithms.
Association rule mining is one of the most important techniques for intelligent system design and has been widely applied in a large number of real applications. However, classical mining algorithms cannot process very large databases in a reasonable amount of time. The sampling approach, which processes a subset of the whole database, is a viable alternative. Obviously, such an approach cannot extract perfectly accurate rules. Previous work has tried to improve the accuracy by removing “outliers” from the initial sample based on global statistical properties of the sample. In this paper, we take the view that the initial sample may actually consist of multiple, possibly overlapping, subsets or clusters. It is more reasonable to apply data clustering techniques to the initial sample before outlier removal is performed on the resulting clusters, so that outliers are removed based on local properties of individual clusters. However, clustering transactional data with very high dimensions is a difficult problem in itself. We solve this problem by interpreting locality sensitive hashing as a means of data clustering. Previously proposed algorithms may then optionally be used to remove the outliers in the individual clusters. We propose several concrete algorithms based on this general strategy. Using an extensive set of synthetic and real datasets, we evaluate our proposed algorithms and find that they exhibit better accuracy or execution time, or both, than previously proposed algorithms. |
|---|---|
| Author | Chen, Chyouhwa; Huang, Chin-Pin; Horng, Shi-Jinn |
| Author_xml | 1. Chen, Chyouhwa; 2. Horng, Shi-Jinn (email: horngsj@yahoo.com.tw); 3. Huang, Chin-Pin |
| CitedBy_id | crossref_primary_10_1145_2629586 crossref_primary_10_1155_2015_217216 crossref_primary_10_1016_j_engappai_2017_07_016 crossref_primary_10_1016_j_scico_2017_04_006 crossref_primary_10_3233_IDA_184183 crossref_primary_10_1109_TCYB_2021_3125196 crossref_primary_10_1002_cpe_2863 crossref_primary_10_1088_1674_1056_23_8_080203 crossref_primary_10_1016_j_asoc_2024_112606 crossref_primary_10_1016_j_eswa_2016_06_008 crossref_primary_10_1145_3725312 |
| Cites_doi | 10.1109/ICPP.2008.25 10.1007/s10618-008-0093-2 10.1145/1514894.1514927 10.1016/j.datak.2007.07.011 10.1145/956750.956761 10.1016/j.ejor.2005.06.056 10.1109/2.781635 10.1090/conm/026/737400 10.1109/LGRS.2008.915595 10.1016/j.eswa.2005.06.006 10.1109/ICDM.2002.1183923 10.1007/s00778-004-0125-5 10.1016/j.eswa.2007.12.031 10.1145/1497577.1497578 10.1145/775047.775114 10.1016/j.patcog.2007.03.026 10.1145/997817.997857 10.1145/1081870.1081932 10.1109/RIDE.1997.583696 10.1145/1327452.1327494 10.1145/233269.233311 10.1016/j.eswa.2006.08.035 10.1016/j.eswa.2007.10.016 |
| ContentType | Journal Article |
| Copyright | 2011 Elsevier Ltd |
| DBID | AAYXX CITATION 7SC 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.eswa.2011.04.018 |
| DatabaseName | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts – Academic; Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; Computer and Information Systems Abstracts – Academic; Advanced Technologies Database with Aerospace; ProQuest Computer Science Collection; Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Computer and Information Systems Abstracts; Computer and Information Systems Abstracts |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Computer Science |
| EISSN | 1873-6793 |
| EndPage | 12397 |
| ExternalDocumentID | 10_1016_j_eswa_2011_04_018 S0957417411005343 |
| GrantInformation_xml | Funder: National Science Council; grant IDs: NSC 96-2918-I-011-002; 96-2221-E-011-022; 95-2221-E-011-032-MY3 |
| ISICitedReferencesCount | 12 |
| ISSN | 0957-4174 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 10 |
| Keywords | Outlier removal; Association rule mining; Sampling; Clustering; Locality-sensitive hashing |
| Language | English |
| PageCount | 10 |
| PublicationCentury | 2000 |
| PublicationDate | 2011-09-15 |
| PublicationDateYYYYMMDD | 2011-09-15 |
| PublicationDecade | 2010 |
| PublicationTitle | Expert systems with applications |
| PublicationYear | 2011 |
| Publisher | Elsevier Ltd |
| StartPage | 12388 |
| SubjectTerms | Algorithms; Association rule mining; Clustering; Clusters; Locality-sensitive hashing; Mining; Outlier removal; Samples; Sampling; Statistical analysis; Statistical methods |
| Title | Locality sensitive hashing for sampling-based algorithms in association rule mining |
| URI | https://dx.doi.org/10.1016/j.eswa.2011.04.018 https://www.proquest.com/docview/1701082286 https://www.proquest.com/docview/889424831 |
| Volume | 38 |
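
The general strategy described in the abstract (hash the initial sample so that similar transactions collide, treat each bucket as a cluster, then remove outliers using properties local to each cluster) can be illustrated with a minimal Python sketch. This record does not say which hash family or which outlier rule the authors use, so everything here is an assumption made for illustration: MinHash over item sets stands in for the LSH scheme, a "majority item set" similarity test stands in for the previously proposed outlier-removal algorithms, and the names `minhash_key`, `lsh_clusters`, `prune_cluster`, and `reduce_sample` are hypothetical.

```python
import zlib
from collections import Counter, defaultdict


def minhash_key(transaction, num_hashes=4):
    """Bucket key for a non-empty transaction (an iterable of items).

    Assumption: MinHash stands in for the paper's LSH scheme. For each seed,
    take the minimum seeded CRC32 over the items; similar item sets are
    likely to share these minima and therefore the same key.
    """
    return tuple(
        min(zlib.crc32(f"{seed}:{item}".encode()) for item in transaction)
        for seed in range(num_hashes)
    )


def lsh_clusters(sample, num_hashes=4):
    """Group transactions whose keys collide; each bucket acts as a cluster."""
    buckets = defaultdict(list)
    for transaction in sample:
        buckets[minhash_key(transaction, num_hashes)].append(frozenset(transaction))
    return list(buckets.values())


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def prune_cluster(cluster, min_similarity=0.5):
    """Illustrative local outlier rule, not the paper's.

    Build the set of items that occur in at least half of the cluster's
    members, then keep only members similar enough to that set.
    """
    counts = Counter(item for t in cluster for item in t)
    core = frozenset(i for i, c in counts.items() if 2 * c >= len(cluster))
    return [t for t in cluster if jaccard(t, core) >= min_similarity]


def reduce_sample(sample, num_hashes=4, min_similarity=0.5):
    """Cluster the sample with LSH, then prune outliers cluster by cluster."""
    kept = []
    for cluster in lsh_clusters(sample, num_hashes):
        kept.extend(prune_cluster(cluster, min_similarity))
    return kept
```

The reduced sample returned by `reduce_sample` would then be passed to an ordinary association rule miner. Using several independent hash tables instead of one would let a transaction fall into more than one bucket, which is one way to obtain the overlapping clusters the abstract mentions.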