A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach
| Published in: | Computers & electrical engineering, Vol. 68, pp. 463-483 |
|---|---|
| Main authors: | Ben Salem, Semeh; Naouali, Sami; Chtourou, Zied |
| Format: | Journal Article |
| Language: | English |
| Published: | Amsterdam: Elsevier Ltd, 1 May 2018 (Elsevier BV) |
| Subjects: | |
| ISSN: | 0045-7906, 1879-0755 |
| Online access: | Get full text |
| Abstract | Partitional clustering algorithms are of considerable interest in pattern recognition because of their high scalability and efficiency. The k-means algorithm, proposed in 1965, has proven highly efficient for numeric clustering but is inadequate for categorical clustering. In 1998, k-modes was proposed as an extension of k-means to cluster categorical datasets. In this paper, a new partition-based categorical method called Manhattan Frequency k-Means (MFk-M) is detailed. It converts the initial categorical data into numeric values using the relative frequency of each modality within its attribute. The L1 norm (Manhattan distance) is used to compute the distance between the observations and the centroids. Finally, an approximation is defined to evaluate each resulting partition during the execution of the algorithm, so as to avoid trivial clusterings such as cluster death. Experimental analysis performed on real-life datasets highlights the reduced complexity costs and high efficiency of our proposal when compared to the standard k-means and k-modes algorithms. |
|---|---|
| Author | Chtourou, Zied; Ben Salem, Semeh; Naouali, Sami |
| Author_xml | – sequence: 1 givenname: Semeh surname: Ben Salem fullname: Ben Salem, Semeh email: semehbensalem0@gmail.com organization: Virtual Reality and Information Technology (VRIT), Military Academy of Fandouk Jedid, Tunisia – sequence: 2 givenname: Sami surname: Naouali fullname: Naouali, Sami email: snaouali@gmail.com organization: Virtual Reality and Information Technology (VRIT), Military Academy of Fandouk Jedid, Tunisia – sequence: 3 givenname: Zied surname: Chtourou fullname: Chtourou, Zied email: ziedchtourou@gmail.com organization: Digital Research Center of Sfax, B.P. 275, Sakiet Ezzit, Sfax 3021, Tunisia |
| CitedBy_id | crossref_primary_10_32604_jai_2023_043229 crossref_primary_10_1155_2023_2206625 crossref_primary_10_1016_j_bdr_2020_100170 crossref_primary_10_1007_s13042_021_01293_w crossref_primary_10_1016_j_eswa_2025_126608 crossref_primary_10_1088_1757_899X_928_3_032081 crossref_primary_10_1016_j_procs_2024_09_444 crossref_primary_10_1155_2020_6617597 crossref_primary_10_1016_j_segan_2023_101091 crossref_primary_10_1016_j_patrec_2022_04_026 crossref_primary_10_1007_s13369_020_04620_5 crossref_primary_10_1002_srin_202000719 crossref_primary_10_1007_s10462_024_10920_1 crossref_primary_10_1016_j_eswa_2020_113555 crossref_primary_10_1016_j_uclim_2024_102234 crossref_primary_10_1080_1206212X_2019_1587892 crossref_primary_10_1016_j_marpolbul_2022_114329 crossref_primary_10_1007_s11869_022_01254_4 crossref_primary_10_1016_j_compenvurbsys_2023_101969 crossref_primary_10_1016_j_eswa_2021_115054 crossref_primary_10_1016_j_procs_2019_08_082 crossref_primary_10_1016_j_eswa_2019_112910 crossref_primary_10_4018_IJSWIS_346377 crossref_primary_10_1007_s42405_024_00814_5 crossref_primary_10_1080_0951192X_2023_2177748 crossref_primary_10_3390_make6020047 crossref_primary_10_1051_matecconf_201925506008 crossref_primary_10_2478_cait_2023_0010 |
| Cites_doi | 10.1016/j.patrec.2017.03.008 10.1148/radiol.2016160293 10.1007/s10295-013-1368-1 10.1016/j.eswa.2013.07.002 10.1016/S0031-3203(02)00060-2 10.1016/j.neucom.2013.11.024 10.1023/A:1009769707641 10.1016/j.eswa.2012.07.021 10.1016/j.neucom.2012.11.009 10.1016/j.cose.2015.09.005 10.1016/j.datak.2007.03.016 10.1016/j.ijleo.2015.09.093 10.1016/j.patcog.2014.01.015 10.1016/j.knosys.2011.07.011 10.1109/TPAMI.2015.2462338 10.1016/0167-8655(95)00075-R 10.1109/TPAMI.2007.53 |
| ContentType | Journal Article |
| Copyright | 2018 Elsevier Ltd Copyright Elsevier BV May 2018 |
| Copyright_xml | – notice: 2018 Elsevier Ltd – notice: Copyright Elsevier BV May 2018 |
| DBID | AAYXX CITATION 7SC 7SP 8FD JQ2 L7M L~C L~D |
| DOI | 10.1016/j.compeleceng.2018.04.023 |
| DatabaseName | CrossRef Computer and Information Systems Abstracts Electronics & Communications Abstracts Technology Research Database ProQuest Computer Science Collection Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Academic Computer and Information Systems Abstracts Professional |
| DatabaseTitle | CrossRef Technology Research Database Computer and Information Systems Abstracts – Academic Electronics & Communications Abstracts ProQuest Computer Science Collection Computer and Information Systems Abstracts Advanced Technologies Database with Aerospace Computer and Information Systems Abstracts Professional |
| DatabaseTitleList | Technology Research Database |
| DeliveryMethod | fulltext_linktorsrc |
| Discipline | Engineering |
| EISSN | 1879-0755 |
| EndPage | 483 |
| ExternalDocumentID | 10_1016_j_compeleceng_2018_04_023 S0045790617327131 |
| ID | FETCH-LOGICAL-c400t-d75e3689c3a6ceffbb6b61ef195625569fb982ee0a5007575edb7c5135e3c1ff3 |
| ISICitedReferencesCount | 34 |
| ISICitedReferencesURI | http://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=Summon&SrcAuth=ProQuest&DestLinkType=CitingArticles&DestApp=WOS_CPL&KeyUT=000437999300036&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| ISSN | 0045-7906 |
| IngestDate | Mon Jul 14 10:32:45 EDT 2025 Sat Nov 29 03:04:34 EST 2025 Tue Nov 18 22:12:43 EST 2025 Fri Feb 23 02:25:59 EST 2024 |
| IsDoiOpenAccess | false |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Keywords | k-means; Crime Mining; Pattern recognition; Categorical clustering; k-modes; Unsupervised learning |
| Language | English |
| LinkModel | OpenURL |
| MergedId | FETCHMERGED-LOGICAL-c400t-d75e3689c3a6ceffbb6b61ef195625569fb982ee0a5007575edb7c5135e3c1ff3 |
| OpenAccessLink | https://hdl.handle.net/11323/5136 |
| PQID | 2088070320 |
| PQPubID | 2045266 |
| PageCount | 21 |
| ParticipantIDs | proquest_journals_2088070320 crossref_primary_10_1016_j_compeleceng_2018_04_023 crossref_citationtrail_10_1016_j_compeleceng_2018_04_023 elsevier_sciencedirect_doi_10_1016_j_compeleceng_2018_04_023 |
| PublicationCentury | 2000 |
| PublicationDate | May 2018 2018-05-00 20180501 |
| PublicationDateYYYYMMDD | 2018-05-01 |
| PublicationDate_xml | – month: 05 year: 2018 text: May 2018 |
| PublicationDecade | 2010 |
| PublicationPlace | Amsterdam |
| PublicationPlace_xml | – name: Amsterdam |
| PublicationTitle | Computers & electrical engineering |
| PublicationYear | 2018 |
| Publisher | Elsevier Ltd; Elsevier BV |
| Publisher_xml | – name: Elsevier Ltd – name: Elsevier BV |
| SSID | ssj0004618 |
| Score | 2.3694174 |
| SourceID | proquest crossref elsevier |
| SourceType | Aggregation Database Enrichment Source Index Database Publisher |
| StartPage | 463 |
| SubjectTerms | Algorithms; Artificial intelligence; Categorical clustering; Centroids; Clustering; Complexity theory; Cost analysis; Crime Mining; Datasets; Distance measurement; Efficiency; k-means; k-modes; Partitions; Pattern recognition; Unsupervised learning |
| Title | A fast and effective partitional clustering algorithm for large categorical datasets using a k-means based approach |
| URI | https://dx.doi.org/10.1016/j.compeleceng.2018.04.023 https://www.proquest.com/docview/2088070320 |
| Volume | 68 |
| WOSCitedRecordID | wos000437999300036&url=https%3A%2F%2Fcvtisr.summon.serialssolutions.com%2F%23%21%2Fsearch%3Fho%3Df%26include.ft.matches%3Dt%26l%3Dnull%26q%3D |
| hasFullText | 1 |
| inHoldings | 1 |
| isFullTextHit | |
| isPrint | |
| journalDatabaseRights | – providerCode: PRVESC databaseName: Elsevier SD Freedom Collection Journals 2021 customDbUrl: eissn: 1879-0755 dateEnd: 99991231 omitProxy: false ssIdentifier: ssj0004618 issn: 0045-7906 databaseCode: AIEXJ dateStart: 19950101 isFulltext: true titleUrlDefault: https://www.sciencedirect.com providerName: Elsevier |
| linkProvider | Elsevier |
| openUrl | ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=A+fast+and+effective+partitional+clustering+algorithm+for+large+categorical+datasets+using+a+k-means+based+approach&rft.jtitle=Computers+%26+electrical+engineering&rft.au=Ben+Salem%2C+Semeh&rft.au=Naouali%2C+Sami&rft.au=Chtourou%2C+Zied&rft.date=2018-05-01&rft.pub=Elsevier+Ltd&rft.issn=0045-7906&rft.eissn=1879-0755&rft.volume=68&rft.spage=463&rft.epage=483&rft_id=info:doi/10.1016%2Fj.compeleceng.2018.04.023&rft.externalDocID=S0045790617327131 |
| thumbnail_l | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/lc.gif&issn=0045-7906&client=summon |
| thumbnail_m | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/mc.gif&issn=0045-7906&client=summon |
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=0045-7906&client=summon |
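
The abstract in this record outlines three ingredients: encoding each categorical value by its relative frequency within its attribute, measuring distance with the L1 (Manhattan) norm, and guarding against cluster death. The following is a minimal sketch of that idea in Python, not the authors' implementation. The function names (`frequency_encode`, `manhattan`, `mfkm_sketch`), the random initialization, the mean-based centroid update, and the empty-cluster re-seeding rule are all assumptions made for illustration; the record does not specify how the paper's partition-evaluation approximation works.

```python
from collections import Counter
import random


def frequency_encode(rows):
    """Map each categorical value to its relative frequency within its attribute."""
    n = len(rows)
    counts = [Counter(column) for column in zip(*rows)]
    return [[counts[j][value] / n for j, value in enumerate(row)] for row in rows]


def manhattan(a, b):
    """L1 distance between two equal-length numeric vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mfkm_sketch(rows, k, max_iter=100, seed=0):
    """Cluster categorical rows into k groups; returns one label per row.

    Hypothetical helper: initialization and empty-cluster handling are
    assumptions, not taken from the paper.
    """
    points = frequency_encode(rows)
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the Manhattan distance.
        new_labels = [
            min(range(k), key=lambda c: manhattan(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the partition has converged
        labels = new_labels
        # Update step: per-attribute mean of each cluster's members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
            else:
                # Assumed guard against cluster death: re-seed the empty
                # cluster with the point farthest from its own centroid.
                far = max(
                    range(len(points)),
                    key=lambda i: manhattan(points[i], centroids[labels[i]]),
                )
                centroids[c] = list(points[far])
    return labels
```

One property of this encoding is worth keeping in mind: two different categories that happen to occur equally often in an attribute receive the same numeric value, so the sketch cannot tell them apart on that attribute.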