Determining the number of clusters using information entropy for mixed data

Bibliographic details
Published in: Pattern Recognition, Vol. 45, No. 6, pp. 2251-2265
Main authors: Liang, Jiye, Zhao, Xingwang, Li, Deyu, Cao, Fuyuan, Dang, Chuangyin
Format: Journal Article
Language: English
Published: Kidlington, Elsevier Ltd, 01.06.2012
ISSN: 0031-3203, 1873-5142
Online access: Full text
Abstract
In cluster analysis, one of the most challenging and difficult problems is the determination of the number of clusters in a data set, which is a basic input parameter for most clustering algorithms. To solve this problem, many algorithms have been proposed for either numerical or categorical data sets. However, these algorithms are not very effective for a mixed data set containing both numerical attributes and categorical attributes. To overcome this deficiency, a generalized mechanism is presented in this paper by integrating Rényi entropy and complement entropy together. The mechanism is able to uniformly characterize within-cluster entropy and between-cluster entropy and to identify the worst cluster in a mixed data set. In order to evaluate the clustering results for mixed data, an effective cluster validity index is also defined in this paper. Furthermore, by introducing a new dissimilarity measure into the k-prototypes algorithm, we develop an algorithm to determine the number of clusters in a mixed data set. The performance of the algorithm has been studied on several synthetic and real world data sets. The comparisons with other clustering algorithms show that the proposed algorithm is more effective in detecting the optimal number of clusters and generates better clustering results.
► A generalized mechanism is presented using information entropy.
► An effective cluster validity index is developed.
► We redefine the dissimilarity measure used in the k-prototypes algorithm.
► An algorithm is presented to determine the number of clusters for mixed data.
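The abstract names two entropy measures without giving their formulas. As a rough illustration only, the sketch below uses the commonly cited textbook forms: complement entropy of a categorical attribute as the sum of p_i(1 - p_i) over its value frequencies, and Rényi quadratic entropy of numerical points estimated with a Gaussian Parzen window. The function names and the kernel width `sigma` are hypothetical choices made for this example, and the way the paper combines the two measures into within-cluster and between-cluster entropy is not reproduced here.

```python
import math
from collections import Counter


def complement_entropy(values):
    """Complement entropy of one categorical attribute: sum_i p_i * (1 - p_i).

    `values` is the list of category labels observed in a cluster.
    (Standard form; the paper's exact normalisation is not shown in this record.)
    """
    n = len(values)
    if n == 0:
        return 0.0
    return sum((c / n) * (1.0 - c / n) for c in Counter(values).values())


def renyi_quadratic_entropy(points, sigma=1.0):
    """Renyi quadratic entropy of numerical points via a Gaussian Parzen window.

    H_2 = -log( (1/N^2) * sum_i sum_j G(x_i - x_j; 2 * sigma^2) )
    `sigma` is an assumed kernel width, not a value taken from the paper.
    """
    n = len(points)
    if n == 0:
        raise ValueError("need at least one point")
    dim = len(points[0])
    var = 2.0 * sigma ** 2  # variance of the convolved Gaussian kernel
    norm = (2.0 * math.pi * var) ** (-dim / 2.0)
    total = 0.0
    for x in points:
        for y in points:
            sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
            total += norm * math.exp(-sq_dist / (2.0 * var))
    return -math.log(total / n ** 2)
```

Under these definitions a cluster whose objects share one category has complement entropy 0, and a tightly packed numerical cluster has lower Rényi entropy than a widely spread one, which is the kind of signal a "worst cluster" test could build on.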
Author Zhao, Xingwang
Dang, Chuangyin
Liang, Jiye
Li, Deyu
Cao, Fuyuan
Author_xml – sequence: 1
  givenname: Jiye
  surname: Liang
  fullname: Liang, Jiye
  email: ljy@sxu.edu.cn
  organization: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, 030006 Shanxi, China
– sequence: 2
  givenname: Xingwang
  surname: Zhao
  fullname: Zhao, Xingwang
  email: zhaoxw84@163.com
  organization: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, 030006 Shanxi, China
– sequence: 3
  givenname: Deyu
  surname: Li
  fullname: Li, Deyu
  email: lidy@sxu.edu.cn
  organization: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, 030006 Shanxi, China
– sequence: 4
  givenname: Fuyuan
  surname: Cao
  fullname: Cao, Fuyuan
  email: cfy@sxu.edu.cn
  organization: Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, School of Computer and Information Technology, Shanxi University, Taiyuan, 030006 Shanxi, China
– sequence: 5
  givenname: Chuangyin
  surname: Dang
  fullname: Dang, Chuangyin
  email: mecdang@cityu.edu.hk
  organization: Department of Manufacturing Engineering and Engineering Management, City University of Hong Kong, Hong Kong
CODEN PTNRA8
ContentType Journal Article
Copyright 2011 Elsevier Ltd
2015 INIST-CNRS
Copyright_xml – notice: 2011 Elsevier Ltd
– notice: 2015 INIST-CNRS
DOI 10.1016/j.patcog.2011.12.017
Discipline Computer Science
Applied Sciences
EISSN 1873-5142
EndPage 2265
ISICitedReferencesCount 102
ISSN 0031-3203
IsPeerReviewed true
IsScholarly true
Issue 6
Keywords Cluster validity index
k-Prototypes algorithm
Number of clusters
Clustering
Information entropy
Mixed data
Cluster analysis
Automatic classification
Prototype
Similarity
Entropy
Algorithm
Signal classification
Algorithm performance
Numerical data
Renyi theory
Language English
License CC BY 4.0
PageCount 15
PublicationCentury 2000
PublicationDate 2012-06-01
PublicationDateYYYYMMDD 2012-06-01
PublicationDate_xml – month: 06
  year: 2012
  text: 2012-06-01
  day: 01
PublicationDecade 2010
PublicationPlace Kidlington
PublicationPlace_xml – name: Kidlington
PublicationTitle Pattern recognition
PublicationYear 2012
Publisher Elsevier Ltd
Elsevier
Publisher_xml – name: Elsevier Ltd
– name: Elsevier
Karypis (10.1016/j.patcog.2011.12.017_bib22) 1999; 32
Shannon (10.1016/j.patcog.2011.12.017_bib39) 1948; 27
Wang (10.1016/j.patcog.2011.12.017_bib54) 2007; 158
10.1016/j.patcog.2011.12.017_bib43
Bandyopadhyay (10.1016/j.patcog.2011.12.017_bib32) 2011; 1
Ahmad (10.1016/j.patcog.2011.12.017_bib12) 2007; 63
10.1016/j.patcog.2011.12.017_bib40
10.1016/j.patcog.2011.12.017_bib38
Liang (10.1016/j.patcog.2011.12.017_bib50) 2005
10.1016/j.patcog.2011.12.017_bib34
10.1016/j.patcog.2011.12.017_bib36
Bai (10.1016/j.patcog.2011.12.017_bib6) 2011; 44
Düntsch (10.1016/j.patcog.2011.12.017_bib42) 1998; 106
Mirkin (10.1016/j.patcog.2011.12.017_bib58) 2001; 45
Cao (10.1016/j.patcog.2011.12.017_bib9) 2010; 18
Guha (10.1016/j.patcog.2011.12.017_bib24) 2000; 25
Hubert (10.1016/j.patcog.2011.12.017_bib62) 1985; 2
Halkidi (10.1016/j.patcog.2011.12.017_bib52) 2008; 29
Gluck (10.1016/j.patcog.2011.12.017_bib55) 1985
Li (10.1016/j.patcog.2011.12.017_bib28) 2008; 20
Bandyopadhyay (10.1016/j.patcog.2011.12.017_bib31) 2008; 20
10.1016/j.patcog.2011.12.017_bib23
Sugar (10.1016/j.patcog.2011.12.017_bib33) 2003; 98
Qian (10.1016/j.patcog.2011.12.017_bib49) 2008; 178
Qian (10.1016/j.patcog.2011.12.017_bib48) 2010; 174
Han (10.1016/j.patcog.2011.12.017_bib1) 2001
Jenssen (10.1016/j.patcog.2011.12.017_bib45) 2006; 49
Gokcay (10.1016/j.patcog.2011.12.017_bib46) 2002; 24
Li (10.1016/j.patcog.2011.12.017_bib19) 2002; 14
Hsu (10.1016/j.patcog.2011.12.017_bib20) 2007; 177
10.1016/j.patcog.2011.12.017_bib63
10.1016/j.patcog.2011.12.017_bib64
10.1016/j.patcog.2011.12.017_bib21
Rezaee (10.1016/j.patcog.2011.12.017_bib53) 1998; 19
Mirkin (10.1016/j.patcog.2011.12.017_bib25) 2011; 1
Jain (10.1016/j.patcog.2011.12.017_bib4) 2010; 31
10.1016/j.patcog.2011.12.017_bib60
10.1016/j.patcog.2011.12.017_bib16
10.1016/j.patcog.2011.12.017_bib18
Bai (10.1016/j.patcog.2011.12.017_bib35) 2011; 24
Kaufman (10.1016/j.patcog.2011.12.017_bib17) 1990
10.1016/j.patcog.2011.12.017_bib57
10.1016/j.patcog.2011.12.017_bib14
10.1016/j.patcog.2011.12.017_bib59
Bezdek (10.1016/j.patcog.2011.12.017_bib61) 1998
Kriegel (10.1016/j.patcog.2011.12.017_bib5) 2009; 3
Sun (10.1016/j.patcog.2011.12.017_bib26) 2004; 37
Huang (10.1016/j.patcog.2011.12.017_bib13) 1998; 2
Liao (10.1016/j.patcog.2011.12.017_bib7) 2005; 38
Fisher (10.1016/j.patcog.2011.12.017_bib56) 1987; 2
Hunt (10.1016/j.patcog.2011.12.017_bib10) 2011; 1
Liang (10.1016/j.patcog.2011.12.017_bib51) 2006; 35
Jain (10.1016/j.patcog.2011.12.017_bib2) 1999; 31
10.1016/j.patcog.2011.12.017_bib11
Leung (10.1016/j.patcog.2011.12.017_bib29) 2000; 22
He (10.1016/j.patcog.2011.12.017_bib41) 2005; vol. 3644
Liang (10.1016/j.patcog.2011.12.017_bib47) 2002; 31
Bandyopadhyay (10.1016/j.patcog.2011.12.017_bib30) 2002; 35
Yan (10.1016/j.patcog.2011.12.017_bib37) 2009; 68
Xu (10.1016/j.patcog.2011.12.017_bib3) 2005; 16
Parzen (10.1016/j.patcog.2011.12.017_bib44) 1962; 33
10.1016/j.patcog.2011.12.017_bib8
Höppner (10.1016/j.patcog.2011.12.017_bib15) 1999
Kothari (10.1016/j.patcog.2011.12.017_bib27) 1999; 20
References_xml – volume: 20
  start-page: 405
  year: 1999
  end-page: 416
  ident: bib27
  article-title: On finding the number of clusters
  publication-title: Pattern Recognition Letters
– volume: 31
  start-page: 331
  year: 2002
  end-page: 342
  ident: bib47
  article-title: A new method for measuring uncertainly and fuzziness in rough set theory
  publication-title: International Journal of General Systems
– volume: 158
  start-page: 2095
  year: 2007
  end-page: 2117
  ident: bib54
  article-title: On fuzzy cluster validity indices
  publication-title: Fuzzy Sets and Systems
– volume: 3
  start-page: 1
  year: 2009
  end-page: 58
  ident: bib5
  article-title: Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering and correlation clustering
  publication-title: ACM Transactions on Knowledge Discovery from Data
– volume: 63
  start-page: 503
  year: 2007
  end-page: 527
  ident: bib12
  article-title: A
  publication-title: Data & Knowledge Engineering
– volume: 35
  start-page: 1197
  year: 2002
  end-page: 1208
  ident: bib30
  article-title: Genetic clustering for automatic evolution of clusters and application to image classification
  publication-title: Pattern Recognition
– reference: T.K. Xiong, S.R. Wang, A. Mayers, E. Monga, DHCC: Divisive hierarchical clustering of categorical data, Data Mining and Knowledge Discovery, doi:
– volume: 1
  start-page: 524
  year: 2011
  end-page: 531
  ident: bib32
  article-title: Genetic algorithms for clustering and fuzzy clustering
  publication-title: WIREs Data Mining and Knowledge Discovery
– reference: A. Renyi, On measures of entropy and information, in: Proceeding of the 4th Berkeley Symposium on Mathematics of Statistics and Probability, 1961, pp. 547–561.
– reference: J.B. MacQueen, Some methods for classification and analysis of multivariate observations, in: Proceeding of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 1967, pp. 281–297.
– reference: K. McKusick, K. Thompson, COBWEB/3: A Portable Implementation, Technical Report FIA-90-6-18-2, NASA Ames Research Center, 1990.
– volume: 18
  start-page: 872
  year: 2010
  end-page: 882
  ident: bib9
  article-title: A framework for clustering categorical time-evolving data
  publication-title: IEEE Transactions on Fuzzy Systems
– volume: 68
  start-page: 28
  year: 2009
  end-page: 48
  ident: bib37
  article-title: Determining the best
  publication-title: Data & Knowledge Engineering
– volume: 98
  start-page: 750
  year: 2003
  end-page: 763
  ident: bib33
  article-title: Finding the number of clusters in a data set: an information theoretic approach
  publication-title: Journal of the American Statistical Association
– volume: 35
  start-page: 641
  year: 2006
  end-page: 654
  ident: bib51
  article-title: The information entropy, rough entropy and knowledge granulation in incomplete information system
  publication-title: International Journal of General Systems
– volume: 32
  start-page: 68
  year: 1999
  end-page: 75
  ident: bib22
  article-title: Chameleon: hierarchical clustering using dynamic modeling
  publication-title: IEEE Computer
– start-page: 283
  year: 1985
  end-page: 287
  ident: bib55
  article-title: Information, uncertainty, and the utility of categories
  publication-title: Proceeding of the 7th Annual Conference of the Cognitive Science Society
– reference: Z.X. Huang, A fast clustering algorithm to cluster very large categorical data sets in data mining, in: Proceeding of the SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, 1997, pp. 1–8.
– volume: 22
  start-page: 1394
  year: 2000
  end-page: 1410
  ident: bib29
  article-title: Clustering by scale-space filtering
  publication-title: IEEE Transactions on Pattern Analysis and Machine Intelligence
– volume: 2
  start-page: 193
  year: 1985
  end-page: 218
  ident: bib62
  article-title: Comparing partitions
  publication-title: Journal of Classification
– reference: D. Barbara, Y. Li, J. Couto, Coolcat: an entropy-based algorithm for categorical clustering, in: Proceeding of the 2002 ACM CIKM International Conference on Information and Knowledge Management, 2002, pp. 582–589.
– reference: W.D. Zhao, W.H. Dai, C.B. Tang, K-centers algorithm for clustering mixed type data, in: Proceedings of the 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2007, pp. 1140–1147.
– volume: 16
  start-page: 645
  year: 2005
  end-page: 678
  ident: bib3
  article-title: Survey of clustering algorithms
  publication-title: IEEE Transactions on Neural Networks
– volume: 45
  start-page: 219
  year: 2001
  end-page: 228
  ident: bib58
  article-title: Reinterpreting the category utility function
  publication-title: Machine Learning
– year: 1998
  ident: bib61
  article-title: Pattern Recognition in Handbook of Fuzzy Computation
– reference: V. Estivill-Castro, J. Yang, A fast and robust general purpose clustering algorithm, in: Proceeding of 6th Pacific Rim International Conference Artificial Intelligence, Melbourne, Australia, 2000, pp. 208–218.
– reference: UCI Machine Learning Repository
– reference: M. Aghagolzadeh, H. Soltanian-Zadeh, B.N. Araabi, A. Aghagolzadeh, Finding the number of clusters in a dataset using an information theoretic hierarchical algorithm, in: Proceedings of the 13th IEEE International Conference on Electronics, Circuits and Systems, 2006, pp. 1336–1339.
– reference: C.C. Aggarwal, J.W. Han, J.Y. Wang, P.S. Yu, A framework for clustering evolving data streams, in: Proceedings of the 29th VLDB Conference, Berlin, Germany, 2003.
– year: 1999
  ident: bib15
  article-title: Fuzzy Cluster Analysis: Methods for Classification, Data Analysis, and Image Recognition
– reference: R. Jensen, Q. Shen, Fuzzy-rough sets for descriptive dimensionality reduction, in: Proceeding of the 2002 IEEE International Conference on Fuzzy Systems, 2002, pp. 29–34.
– volume: 37
  start-page: 2027
  year: 2004
  end-page: 2037
  ident: bib26
  article-title: FCM-based model selection algorithms for determining the number of clusters
  publication-title: Pattern Recognition
– volume: 2
  start-page: 283
  year: 1998
  end-page: 304
  ident: bib13
  article-title: Extensions to the
  publication-title: Data Mining and Knowledge Discovery
– volume: 33
  start-page: 1065
  year: 1962
  end-page: 1076
  ident: bib44
  article-title: On the estimation of a probability density function and the mode
  publication-title: Annals of Mathematical Statistics
– volume: 178
  start-page: 181
  year: 2008
  end-page: 202
  ident: bib49
  article-title: Measures for evaluating the decision performance of a decision table in rough set theory
  publication-title: Information Sciences
– reference: K.K. Chen, L. Liu, The best
– reference: M. Bohanec, V. Rajkovic, Knowledge acquisition and explanation for multi-attribute decision making, in: Proceeding of the 8th International Workshop on Expert Systems and Their Applications, Avignon, France, 1988, pp. 59–78.
– reference: , 2011.
– volume: 25
  start-page: 345
  year: 2000
  end-page: 366
  ident: bib24
  article-title: ROCK: a robust clustering algorithm for categorical attributes
  publication-title: Information Systems
– volume: 44
  start-page: 2843
  year: 2011
  end-page: 2861
  ident: bib6
  article-title: A novel attribute weighting algorithm for clustering high-dimensional categorical data
  publication-title: Pattern Recognition
– volume: 49
  start-page: 49
  year: 2006
  end-page: 65
  ident: bib45
  article-title: Some equivalences between kernel methods and information theoretic methods
  publication-title: Journal of VLSI Signal Processing Systems
– volume: 174
  start-page: 597
  year: 2010
  end-page: 618
  ident: bib48
  article-title: Positive approximation: an accelerator for attribute reduction in rough set theory
  publication-title: Artificial Intelligence
– volume: 31
  start-page: 264
  year: 1999
  end-page: 323
  ident: bib2
  article-title: Data clustering: a review
  publication-title: ACM Computing Surveys
– reference: for entropy-based categorical clustering, in: Proceeding of the 17th International Conference on Scientific and Statistical Database Management, 2005.
– year: 1990
  ident: bib17
  article-title: Finding Groups in Data: An Introduction to Cluster Analysis
– reference: S. Guha, R. Rastogi, K. Shim, CURE: an efficient clustering algorithm for large databases, in: Proceedings of the ACM SIGMOD International Conference on Management of Data, 1998, pp. 73–84.
– volume: 24
  start-page: 158
  year: 2002
  end-page: 171
  ident: bib46
  article-title: Information theoretic clustering
  publication-title: IEEE Transactions on Pattern Analysis and Machine Intelligence
– volume: 1
  start-page: 352
  year: 2011
  end-page: 361
  ident: bib10
  article-title: Clustering mixed data
  publication-title: WIREs Data Mining and Knowledge Discovery
– volume: 14
  start-page: 673
  year: 2002
  end-page: 690
  ident: bib19
  article-title: Unsupervised learning with mixed numeric and nominal data
  publication-title: IEEE Transactions on Knowledge and Data Engineering
– volume: 31
  start-page: 651
  year: 2010
  end-page: 666
  ident: bib4
  article-title: Data clustering: 50 years beyond k-means
  publication-title: Pattern Recognition Letters
– volume: 20
  start-page: 1519
  year: 2008
  end-page: 1534
  ident: bib28
  article-title: Agglomerative fuzzy k-means clustering algorithm with selection of number of clusters
  publication-title: IEEE Transactions on Knowledge and Data Engineering
– volume: 2
  start-page: 139
  year: 1987
  end-page: 172
  ident: bib56
  article-title: Knowledge acquisition via incremental conceptual clustering
  publication-title: Machine Learning
– volume: 29
  start-page: 773
  year: 2008
  end-page: 786
  ident: bib52
  article-title: A density-based cluster validity approach using multi-representatives
  publication-title: Pattern Recognition Letters
– volume: 106
  start-page: 109
  year: 1998
  end-page: 137
  ident: bib42
  article-title: Uncertainty measures of rough set prediction
  publication-title: Artificial Intelligence
– year: 2001
  ident: bib1
  article-title: Data Mining Concepts and Techniques
– reference: T. Zhang, R. Ramakrishnan, M. Livny, BIRCH: an efficient data clustering method for very large databases, in: Proceedings of the ACM SIGMOD International Conference on Management of Data, 1996, pp. 103–114.
– volume: 1
  start-page: 252
  year: 2011
  end-page: 260
  ident: bib25
  article-title: Choosing the number of clusters
  publication-title: WIREs Data Mining and Knowledge Discovery
– volume: 24
  start-page: 785
  year: 2011
  end-page: 795
  ident: bib35
  article-title: An initialization method to simultaneously find initial cluster centers and the number of clusters for clustering categorical data
  publication-title: Knowledge-Based Systems
– year: 2005
  ident: bib50
  article-title: Uncertainty and Knowledge Acquisition in Information Systems
– volume: 19
  start-page: 237
  year: 1998
  end-page: 246
  ident: bib53
  article-title: A new cluster validity index for the fuzzy c-mean
  publication-title: Pattern Recognition Letters
– volume: 177
  start-page: 4474
  year: 2007
  end-page: 4492
  ident: bib20
  article-title: Hierarchical clustering of mixed data based on distance hierarchy
  publication-title: Information Sciences
– volume: 3644
  start-page: 400
  year: 2005
  end-page: 409
  ident: bib41
  article-title: An optimization model for outlier detection in categorical data
  publication-title: Lecture Notes in Computer Science
– volume: 20
  start-page: 1441
  year: 2008
  end-page: 1457
  ident: bib31
  article-title: A point symmetry-based clustering technique for automatic evolution of clusters
  publication-title: IEEE Transactions on Knowledge and Data Engineering
– volume: 38
  start-page: 1857
  year: 2005
  end-page: 1874
  ident: bib7
  article-title: Clustering of time series data survey
  publication-title: Pattern Recognition
– volume: 27
  start-page: 379
  year: 1948
  end-page: 423
  ident: bib39
  article-title: A mathematical theory of communication
  publication-title: Bell System Technical Journal
– reference: J. Al-Shaqsi, W.J. Wang, A clustering ensemble method for clustering mixed data, in: The 2010 International Joint Conference on Neural Networks, 2010.
– volume: 32
  start-page: 68
  issue: 8
  year: 1999
  ident: 10.1016/j.patcog.2011.12.017_bib22
  article-title: Chameleon: hierarchical clustering using dynamic modeling
  publication-title: IEEE Computer
  doi: 10.1109/2.781637
– volume: 68
  start-page: 28
  issue: 1
  year: 2009
  ident: 10.1016/j.patcog.2011.12.017_bib37
  article-title: Determining the best k for clustering transactional datasets: a coverage density-based approach
  publication-title: Data & Knowledge Engineering
  doi: 10.1016/j.datak.2008.08.005
– volume: 45
  start-page: 219
  issue: 2
  year: 2001
  ident: 10.1016/j.patcog.2011.12.017_bib58
  article-title: Reinterpreting the category utility function
  publication-title: Machine Learning
  doi: 10.1023/A:1010924920739
– ident: 10.1016/j.patcog.2011.12.017_bib43
– volume: 14
  start-page: 673
  issue: 4
  year: 2002
  ident: 10.1016/j.patcog.2011.12.017_bib19
  article-title: Unsupervised learning with mixed numeric and nominal data
  publication-title: IEEE Transactions on Knowledge and Data Engineering
  doi: 10.1109/TKDE.2002.1019208
– ident: 10.1016/j.patcog.2011.12.017_bib38
  doi: 10.1109/FUZZ.2002.1004954
– year: 1998
  ident: 10.1016/j.patcog.2011.12.017_bib61
– ident: 10.1016/j.patcog.2011.12.017_bib14
– volume: 27
  start-page: 379
  issue: 3-4
  year: 1948
  ident: 10.1016/j.patcog.2011.12.017_bib39
  article-title: A mathematical theory of communication
  publication-title: Bell System Technical Journal
  doi: 10.1002/j.1538-7305.1948.tb01338.x
– ident: 10.1016/j.patcog.2011.12.017_bib40
  doi: 10.1145/584792.584888
– volume: 1
  start-page: 524
  issue: 6
  year: 2011
  ident: 10.1016/j.patcog.2011.12.017_bib32
  article-title: Genetic algorithms for clustering and fuzzy clustering
  publication-title: WIREs Data Mining and Knowledge Discovery
  doi: 10.1002/widm.47
– ident: 10.1016/j.patcog.2011.12.017_bib57
– volume: 37
  start-page: 2027
  issue: 10
  year: 2004
  ident: 10.1016/j.patcog.2011.12.017_bib26
  article-title: FCM-based model selection algorithms for determining the number of clusters
  publication-title: Pattern Recognition
  doi: 10.1016/j.patcog.2004.03.012
– year: 2005
  ident: 10.1016/j.patcog.2011.12.017_bib50
– ident: 10.1016/j.patcog.2011.12.017_bib34
  doi: 10.1109/ICECS.2006.379729
– volume: 35
  start-page: 641
  issue: 6
  year: 2006
  ident: 10.1016/j.patcog.2011.12.017_bib51
  article-title: The information entropy, rough entropy and knowledge granulation in incomplete information system
  publication-title: International Journal of General Systems
  doi: 10.1080/03081070600687668
– ident: 10.1016/j.patcog.2011.12.017_bib59
  doi: 10.1109/IJCNN.2010.5596684
– volume: 2
  start-page: 139
  issue: 2
  year: 1987
  ident: 10.1016/j.patcog.2011.12.017_bib56
  article-title: Knowledge acquisition via incremental conceptual clustering
  publication-title: Machine Learning
  doi: 10.1023/A:1022852608280
– volume: 1
  start-page: 352
  issue: 4
  year: 2011
  ident: 10.1016/j.patcog.2011.12.017_bib10
  article-title: Clustering mixed data
  publication-title: WIREs Data Mining and Knowledge Discovery
  doi: 10.1002/widm.33
– ident: 10.1016/j.patcog.2011.12.017_bib23
  doi: 10.1145/233269.233324
– volume: 38
  start-page: 1857
  issue: 11
  year: 2005
  ident: 10.1016/j.patcog.2011.12.017_bib7
  article-title: Clustering of time series data survey
  publication-title: Pattern Recognition
  doi: 10.1016/j.patcog.2005.01.025
– volume: 1
  start-page: 252
  issue: 3
  year: 2011
  ident: 10.1016/j.patcog.2011.12.017_bib25
  article-title: Choosing the number of clusters
  publication-title: WIREs Data Mining and Knowledge Discovery
  doi: 10.1002/widm.15
– volume: 31
  start-page: 264
  issue: 3
  year: 1999
  ident: 10.1016/j.patcog.2011.12.017_bib2
  article-title: Data clustering: a review
  publication-title: ACM Computing Surveys
  doi: 10.1145/331499.331504
– volume: 98
  start-page: 750
  issue: 463
  year: 2003
  ident: 10.1016/j.patcog.2011.12.017_bib33
  article-title: Finding the number of clusters in a data set: an information theoretic approach
  publication-title: Journal of the American Statistical Association
  doi: 10.1198/016214503000000666
– volume: 3644
  start-page: 400
  year: 2005
  ident: 10.1016/j.patcog.2011.12.017_bib41
  article-title: An optimization model for outlier detection in categorical data
– ident: 10.1016/j.patcog.2011.12.017_bib21
  doi: 10.1145/276304.276312
– volume: 20
  start-page: 405
  issue: 4
  year: 1999
  ident: 10.1016/j.patcog.2011.12.017_bib27
  article-title: On finding the number of clusters
  publication-title: Pattern Recognition Letters
  doi: 10.1016/S0167-8655(99)00008-2
– volume: 16
  start-page: 645
  issue: 3
  year: 2005
  ident: 10.1016/j.patcog.2011.12.017_bib3
  article-title: Survey of clustering algorithms
  publication-title: IEEE Transactions on Neural Networks
  doi: 10.1109/TNN.2005.845141
– volume: 31
  start-page: 651
  issue: 8
  year: 2010
  ident: 10.1016/j.patcog.2011.12.017_bib4
  article-title: Data clustering: 50 years beyond k-means
  publication-title: Pattern Recognition Letters
  doi: 10.1016/j.patrec.2009.09.011
– ident: 10.1016/j.patcog.2011.12.017_bib8
  doi: 10.1016/B978-012722442-8/50016-1
– ident: 10.1016/j.patcog.2011.12.017_bib16
– volume: 33
  start-page: 1065
  issue: 3
  year: 1962
  ident: 10.1016/j.patcog.2011.12.017_bib44
  article-title: On the estimation of a probability density function and the mode
  publication-title: Annals of Mathematical Statistics
  doi: 10.1214/aoms/1177704472
– volume: 49
  start-page: 49
  issue: 1–2
  year: 2006
  ident: 10.1016/j.patcog.2011.12.017_bib45
  article-title: Some equivalences between kernel methods and information theoretic methods
  publication-title: Journal of VLSI Signal Processing Systems
  doi: 10.1007/s11265-006-9771-8
– volume: 31
  start-page: 331
  issue: 4
  year: 2002
  ident: 10.1016/j.patcog.2011.12.017_bib47
  article-title: A new method for measuring uncertainty and fuzziness in rough set theory
  publication-title: International Journal of General Systems
  doi: 10.1080/0308107021000013635
– volume: 29
  start-page: 773
  issue: 6
  year: 2008
  ident: 10.1016/j.patcog.2011.12.017_bib52
  article-title: A density-based cluster validity approach using multi-representatives
  publication-title: Pattern Recognition Letters
  doi: 10.1016/j.patrec.2007.12.011
– year: 2001
  ident: 10.1016/j.patcog.2011.12.017_bib1
– volume: 3
  start-page: 1
  issue: 1
  year: 2009
  ident: 10.1016/j.patcog.2011.12.017_bib5
  article-title: Clustering high-dimensional data: a survey on subspace clustering, pattern-based clustering and correlation clustering
  publication-title: ACM Transactions on Knowledge Discovery from Data
  doi: 10.1145/1497577.1497578
– volume: 22
  start-page: 1394
  issue: 12
  year: 2000
  ident: 10.1016/j.patcog.2011.12.017_bib29
  article-title: Clustering by scale-space filtering
  publication-title: IEEE Transactions on Pattern Analysis and Machine Intelligence
– volume: 106
  start-page: 109
  issue: 1
  year: 1998
  ident: 10.1016/j.patcog.2011.12.017_bib42
  article-title: Uncertainty measures of rough set prediction
  publication-title: Artificial Intelligence
  doi: 10.1016/S0004-3702(98)00091-5
– ident: 10.1016/j.patcog.2011.12.017_bib60
  doi: 10.1007/978-3-540-71701-0_129
– ident: 10.1016/j.patcog.2011.12.017_bib36
– volume: 24
  start-page: 158
  issue: 2
  year: 2002
  ident: 10.1016/j.patcog.2011.12.017_bib46
  article-title: Information theoretic clustering
  publication-title: IEEE Transactions on Pattern Analysis and Machine Intelligence
  doi: 10.1109/34.982897
– ident: 10.1016/j.patcog.2011.12.017_bib64
– volume: 63
  start-page: 503
  issue: 2
  year: 2007
  ident: 10.1016/j.patcog.2011.12.017_bib12
  article-title: A k-mean clustering algorithm for mixed numeric and categorical data
  publication-title: Data & Knowledge Engineering
  doi: 10.1016/j.datak.2007.03.016
– volume: 19
  start-page: 237
  issue: 3-4
  year: 1998
  ident: 10.1016/j.patcog.2011.12.017_bib53
  article-title: A new cluster validity index for the fuzzy c-mean
  publication-title: Pattern Recognition Letters
  doi: 10.1016/S0167-8655(97)00168-2
– volume: 174
  start-page: 597
  issue: 9–10
  year: 2010
  ident: 10.1016/j.patcog.2011.12.017_bib48
  article-title: Positive approximation: an accelerator for attribute reduction in rough set theory
  publication-title: Artificial Intelligence
  doi: 10.1016/j.artint.2010.04.018
– ident: 10.1016/j.patcog.2011.12.017_bib11
  doi: 10.1007/3-540-44533-1_24
– volume: 25
  start-page: 345
  issue: 5
  year: 2000
  ident: 10.1016/j.patcog.2011.12.017_bib24
  article-title: ROCK: a robust clustering algorithm for categorical attributes
  publication-title: Information Systems
  doi: 10.1016/S0306-4379(00)00022-3
– year: 1999
  ident: 10.1016/j.patcog.2011.12.017_bib15
– volume: 35
  start-page: 1197
  issue: 6
  year: 2002
  ident: 10.1016/j.patcog.2011.12.017_bib30
  article-title: Genetic clustering for automatic evolution of clusters and application to image classification
  publication-title: Pattern Recognition
  doi: 10.1016/S0031-3203(01)00108-X
– volume: 158
  start-page: 2095
  issue: 19
  year: 2007
  ident: 10.1016/j.patcog.2011.12.017_bib54
  article-title: On fuzzy cluster validity indices
  publication-title: Fuzzy Sets and Systems
  doi: 10.1016/j.fss.2007.03.004
– volume: 44
  start-page: 2843
  issue: 12
  year: 2011
  ident: 10.1016/j.patcog.2011.12.017_bib6
  article-title: A novel attribute weighting algorithm for clustering high-dimensional categorical data
  publication-title: Pattern Recognition
  doi: 10.1016/j.patcog.2011.04.024
– volume: 177
  start-page: 4474
  issue: 20
  year: 2007
  ident: 10.1016/j.patcog.2011.12.017_bib20
  article-title: Hierarchical clustering of mixed data based on distance hierarchy
  publication-title: Information Sciences
  doi: 10.1016/j.ins.2007.05.003
– volume: 178
  start-page: 181
  issue: 1
  year: 2008
  ident: 10.1016/j.patcog.2011.12.017_bib49
  article-title: Measures for evaluating the decision performance of a decision table in rough set theory
  publication-title: Information Sciences
  doi: 10.1016/j.ins.2007.08.010
– year: 1990
  ident: 10.1016/j.patcog.2011.12.017_bib17
– volume: 18
  start-page: 872
  issue: 5
  year: 2010
  ident: 10.1016/j.patcog.2011.12.017_bib9
  article-title: A framework for clustering categorical time-evolving data
  publication-title: IEEE Transactions on Fuzzy Systems
  doi: 10.1109/TFUZZ.2010.2050891
– volume: 20
  start-page: 1441
  issue: 11
  year: 2008
  ident: 10.1016/j.patcog.2011.12.017_bib31
  article-title: A point symmetry-based clustering technique for automatic evolution of clusters
  publication-title: IEEE Transactions on Knowledge and Data Engineering
  doi: 10.1109/TKDE.2008.79
– volume: 2
  start-page: 283
  issue: 3
  year: 1998
  ident: 10.1016/j.patcog.2011.12.017_bib13
  article-title: Extensions to the k-means algorithm for clustering large data sets with categorical values
  publication-title: Data Mining and Knowledge Discovery
  doi: 10.1023/A:1009769707641
– start-page: 283
  year: 1985
  ident: 10.1016/j.patcog.2011.12.017_bib55
  article-title: Information, uncertainty, and the utility of categories
– volume: 20
  start-page: 1519
  issue: 11
  year: 2008
  ident: 10.1016/j.patcog.2011.12.017_bib28
  article-title: Agglomerative fuzzy k-means clustering algorithm with selection of number of clusters
  publication-title: IEEE Transactions on Knowledge and Data Engineering
  doi: 10.1109/TKDE.2008.88
– ident: 10.1016/j.patcog.2011.12.017_bib18
  doi: 10.1007/978-3-642-20841-6_22
– ident: 10.1016/j.patcog.2011.12.017_bib63
– volume: 24
  start-page: 785
  issue: 6
  year: 2011
  ident: 10.1016/j.patcog.2011.12.017_bib35
  article-title: An initialization method to simultaneously find initial cluster centers and the number of clusters for clustering categorical data
  publication-title: Knowledge-Based Systems
  doi: 10.1016/j.knosys.2011.02.015
– volume: 2
  start-page: 193
  issue: 1
  year: 1985
  ident: 10.1016/j.patcog.2011.12.017_bib62
  article-title: Comparing partitions
  publication-title: Journal of Classification
  doi: 10.1007/BF01908075
StartPage 2251
SubjectTerms Algorithms
Applied sciences
Cluster validity index
Clustering
Clusters
Complement
Entropy (Information theory)
Exact sciences and technology
Information entropy
Information, signal and communications theory
k-Prototypes algorithm
Mixed data
Number of clusters
Optimization
Pattern recognition
Signal and communications theory
Signal representation. Spectral analysis
Signal, noise
Telecommunications and information theory
Title Determining the number of clusters using information entropy for mixed data
URI https://dx.doi.org/10.1016/j.patcog.2011.12.017
https://www.proquest.com/docview/1019644829
Volume 45