Locality sensitive hashing for sampling-based algorithms in association rule mining

► A novel sampling approach with association rule mining can process very large databases in a reasonable time.
► Outliers are removed based on local properties of individual clusters.
► The proposed algorithms are shown to exhibit better accuracy or execution time than previously proposed algorithms.

Bibliographic Details
Published in: Expert Systems with Applications, Vol. 38, no. 10, pp. 12388-12397
Main Authors: Chen, Chyouhwa; Horng, Shi-Jinn; Huang, Chin-Pin
Format: Journal Article
Language: English
Published: Elsevier Ltd 15.09.2011
ISSN: 0957-4174, 1873-6793
Abstract: Association rule mining is one of the most important techniques for intelligent system design and has been widely applied in a large number of real applications. However, classical mining algorithms cannot process very large databases in a reasonable amount of time. The sampling approach, which processes a subset of the whole database, is a viable alternative. Obviously, such an approach cannot extract perfectly accurate rules. Previous work has tried to improve the accuracy by removing “outliers” from the initial sample based on global statistical properties of the sample. In this paper, we take the view that the initial sample may actually consist of multiple, possibly overlapping, subsets or clusters. It is more reasonable to apply data clustering techniques to the initial sample before outlier removal is performed on the resulting clusters, so that outliers are removed based on local properties of individual clusters. However, clustering transactional data with very high dimensions is a difficult problem in itself. We solve this problem by interpreting locality sensitive hashing as a means for data clustering. Previously proposed algorithms may then optionally be used to remove the outliers in the individual clusters. We propose several concrete algorithms based on this general strategy. Using an extensive set of synthetic and real datasets, we evaluate our proposed algorithms and find that they exhibit better accuracy or execution time, or both, than previously proposed algorithms.
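Illustrative note: the general strategy in the abstract (hash a sample of transactions into buckets, treat buckets as clusters, then filter within clusters) can be sketched roughly as below. This is only a sketch under stated assumptions, not the authors' algorithm: the record does not say which LSH family the paper uses, so MinHash with banding is swapped in here, and the function names, parameters, and the "drop transactions that never share a bucket" filter are all hypothetical stand-ins for the paper's cluster-local outlier removal.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # assumed modulus; item ids must be integers in [0, PRIME)


def minhash_signature(transaction, hash_params):
    """Return one min-hash value per (a, b) pair for a non-empty set of item ids."""
    return tuple(
        min((a * item + b) % PRIME for item in transaction)
        for a, b in hash_params
    )


def lsh_buckets(transactions, num_hashes=8, bands=4, seed=0):
    """Map (band, band-signature) keys to the transaction indices hashed there.

    Transactions that agree on every row of at least one band land in the
    same bucket, so buckets act as (possibly overlapping) clusters.
    """
    rng = random.Random(seed)
    hash_params = [
        (rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(num_hashes)
    ]
    rows = num_hashes // bands
    buckets = defaultdict(list)
    for idx, transaction in enumerate(transactions):
        sig = minhash_signature(transaction, hash_params)
        for band in range(bands):
            key = (band, sig[band * rows:(band + 1) * rows])
            buckets[key].append(idx)
    return buckets


def drop_small_buckets(buckets, min_size=2):
    """Toy cluster-local filter (hypothetical): keep indices that share a bucket."""
    kept = set()
    for members in buckets.values():
        if len(members) >= min_size:
            kept.update(members)
    return kept


if __name__ == "__main__":
    sample = [{1, 2, 3}, {1, 2, 3}, {1, 2, 3, 4}, {50, 60, 70}]
    kept = drop_small_buckets(lsh_buckets(sample))
    print(sorted(kept))
```

In this sketch, identical transactions always share every bucket, while a transaction with no items in common with any other can never share one, because each hash `(a * item + b) % PRIME` with `a != 0` maps distinct item ids to distinct values. Whether a partially overlapping transaction is kept depends on the random hash parameters.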
Author_xml – sequence: 1
  givenname: Chyouhwa
  surname: Chen
  fullname: Chen, Chyouhwa
– sequence: 2
  givenname: Shi-Jinn
  surname: Horng
  fullname: Horng, Shi-Jinn
  email: horngsj@yahoo.com.tw
– sequence: 3
  givenname: Chin-Pin
  surname: Huang
  fullname: Huang, Chin-Pin
ContentType Journal Article
Copyright 2011 Elsevier Ltd
DOI 10.1016/j.eswa.2011.04.018
Discipline Computer Science
EISSN 1873-6793
EndPage 12397
ExternalDocumentID 10_1016_j_eswa_2011_04_018
S0957417411005343
GrantInformation_xml – fundername: National Science Council
  grantid: NSC 96-2918-I-011-002; 96-2221-E-011-022; 95-2221-E-011-032-MY3
ISICitedReferencesCount 12
ISSN 0957-4174
IsPeerReviewed true
IsScholarly true
Issue 10
Keywords Outlier removal
Association rule mining
Sampling
Clustering
Locality-sensitive hashing
Language English
PageCount 10
PublicationCentury 2000
PublicationDate 2011-09-15
PublicationDateYYYYMMDD 2011-09-15
PublicationDate_xml – month: 09
  year: 2011
  text: 2011-09-15
  day: 15
PublicationDecade 2010
PublicationTitle Expert systems with applications
PublicationYear 2011
Publisher Elsevier Ltd
Publisher_xml – name: Elsevier Ltd
StartPage 12388
SubjectTerms Algorithms
Association rule mining
Clustering
Clusters
Locality-sensitive hashing
Mining
Outlier removal
Samples
Sampling
Statistical analysis
Statistical methods
Title Locality sensitive hashing for sampling-based algorithms in association rule mining
URI https://dx.doi.org/10.1016/j.eswa.2011.04.018
https://www.proquest.com/docview/1701082286
https://www.proquest.com/docview/889424831
Volume 38