CURE: an efficient clustering algorithm for large databases
| Published in: | Information Systems (Oxford), vol. 26, no. 1, pp. 35-58 |
|---|---|
| Main authors: | Guha, Sudipto; Rastogi, Rajeev; Shim, Kyuseok |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier Ltd, 1 March 2001 |
| Subjects: | Clustering; Clustering Algorithms; Computer applications; Data Mining; Knowledge Discovery |
| ISSN: | 0306-4379, 1873-6076 |
| Abstract | Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favor clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. We propose a new clustering algorithm called CURE that is more robust to outliers, and identifies clusters having non-spherical shapes and wide variances in size. CURE achieves this by representing each cluster by a certain fixed number of points that are generated by selecting well scattered points from the cluster and then shrinking them toward the center of the cluster by a specified fraction. Having more than one representative point per cluster allows CURE to adjust well to the geometry of non-spherical shapes and the shrinking helps to dampen the effects of outliers. To handle large databases, CURE employs a combination of random sampling and partitioning. A random sample drawn from the data set is first partitioned and each partition is partially clustered. The partial clusters are then clustered in a second pass to yield the desired clusters. Our experimental results confirm that the quality of clusters produced by CURE is much better than those found by existing algorithms. Furthermore, they demonstrate that random sampling and partitioning enable CURE to not only outperform existing algorithms but also to scale well for large databases without sacrificing clustering quality. |
|---|---|
| AbstractList | Copyright 2001. Published by Elsevier Science Ltd. Clustering, in data mining, is useful for discovering groups and identifying interesting distributions in the underlying data. Traditional clustering algorithms either favour clusters with spherical shapes and similar sizes, or are very fragile in the presence of outliers. Proposes a new clustering algorithm called CURE that is more robust to outliers and identifies clusters having non-spherical shapes and wide variances in size. CURE achieves this by representing each cluster by a certain fixed number of points that are generated by selecting well scattered points from the cluster and then shrinking them toward the centre of the cluster by a specified fraction. Experimental results confirm that the quality of clusters produced by CURE is much better than those found by existing algorithms. (Original abstract - amended) |
| Author | Rastogi, Rajeev; Shim, Kyuseok; Guha, Sudipto |
| Author_xml | 1. Guha, Sudipto (Stanford University, Stanford, CA 94305, USA); 2. Rastogi, Rajeev (Bell Laboratories, Murray Hill, NJ 07974, USA); 3. Shim, Kyuseok (Korea Advanced Institute of Science and Technology and Advanced Information Technology Research Center, Taejon 305-701, Korea) |
| Cites_doi | 10.1145/3147.3165; 10.1145/355744.355745; 10.1090/pspum/007/0164283 |
| ContentType | Journal Article |
| Copyright | 2001 |
| Copyright_xml | – notice: 2001 |
| DOI | 10.1016/S0306-4379(01)00008-4 |
| DatabaseName | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; ProQuest Computer Science Collection; Advanced Technologies Database with Aerospace; Computer and Information Systems Abstracts Academic; Computer and Information Systems Abstracts Professional; Library & Information Sciences Abstracts (LISA); Library & Information Science Abstracts (LISA) |
| DatabaseTitle | CrossRef; Computer and Information Systems Abstracts; Technology Research Database; Computer and Information Systems Abstracts – Academic; Advanced Technologies Database with Aerospace; ProQuest Computer Science Collection; Computer and Information Systems Abstracts Professional; Library and Information Science Abstracts (LISA) |
| DatabaseTitleList | Computer and Information Systems Abstracts; Library and Information Science Abstracts (LISA) |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering Computer Science |
| EISSN | 1873-6076 |
| EndPage | 58 |
| ExternalDocumentID | 10_1016_S0306_4379_01_00008_4 S0306437901000084 |
| ISICitedReferencesCount | 520 |
| ISSN | 0306-4379 |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 1 |
| Keywords | Clustering Algorithms; Data Mining; Knowledge Discovery |
| Language | English |
| License | https://www.elsevier.com/tdm/userlicense/1.0 |
| PageCount | 24 |
| PublicationCentury | 2000 |
| PublicationDate | 2001-03-01 |
| PublicationDateYYYYMMDD | 2001-03-01 |
| PublicationDate_xml | – month: 03 year: 2001 text: 2001-03-01 day: 01 |
| PublicationDecade | 2000 |
| PublicationTitle | Information systems (Oxford) |
| PublicationYear | 2001 |
| Publisher | Elsevier Ltd |
| Publisher_xml | – name: Elsevier Ltd |
| StartPage | 35 |
| SubjectTerms | Clustering; Clustering Algorithms; Computer applications; Data Mining; Knowledge Discovery |
| Title | CURE: an efficient clustering algorithm for large databases |
| URI | https://dx.doi.org/10.1016/S0306-4379(01)00008-4 https://www.proquest.com/docview/26684429 https://www.proquest.com/docview/57541089 |
| Volume | 26 |
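
The abstract says that CURE represents each cluster by a fixed number of well scattered points that are then shrunk toward the cluster center by a specified fraction. The record itself gives no pseudocode, so the following is only a minimal sketch of that one step. The function names (`centroid`, `scattered_points`, `shrink`, `representatives`) and the parameter names `c` (number of representatives) and `alpha` (shrink fraction) are hypothetical, and the greedy farthest-point rule used to pick "well scattered" points is an assumption made for illustration.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def scattered_points(points, c):
    """Pick up to c points that are spread out over the cluster.

    Assumed heuristic: start with the point farthest from the centroid,
    then repeatedly add the point whose nearest already-chosen point is
    as far away as possible.
    """
    mean = centroid(points)
    remaining = list(points)
    chosen = []
    while remaining and len(chosen) < c:
        if not chosen:
            best = max(remaining, key=lambda p: math.dist(p, mean))
        else:
            best = max(
                remaining,
                key=lambda p: min(math.dist(p, q) for q in chosen),
            )
        remaining.remove(best)
        chosen.append(best)
    return chosen


def shrink(points, mean, alpha):
    """Move each point toward mean by the fraction alpha (0 = no move, 1 = onto mean)."""
    return [
        tuple(x + alpha * (m - x) for x, m in zip(p, mean))
        for p in points
    ]


def representatives(cluster, c=4, alpha=0.5):
    """Representative points of one cluster: scattered points, shrunk toward the centroid."""
    mean = centroid(cluster)
    return shrink(scattered_points(cluster, c), mean, alpha)


if __name__ == "__main__":
    # A square with one interior point; the corners are the scattered points.
    cluster = [(0, 0), (4, 0), (0, 4), (4, 4), (2, 2)]
    print(representatives(cluster, c=4, alpha=0.5))
```

In this sketch a larger `alpha` pulls the representatives closer to the centroid, which is the mechanism the abstract credits with dampening the effect of outliers, while keeping several representatives per cluster is what lets the representation follow non-spherical shapes. The random sampling and partitioning stages mentioned in the abstract are not modeled here.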