Hybrid PCA-ILGC clustering approach for high dimensional data

Published in: 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 420-424
Main authors: Musdholifah, A., Hashim, S. Z. M., Ngah, R.
Format: Conference paper
Language: English
Published: IEEE, 1 October 2012
ISBN:9781467317139, 1467317136
ISSN:1062-922X
Abstract The rapid growth in the availability of high-dimensional datasets has rendered conventional approaches insufficient for extracting hidden, useful information. As a result, researchers are now challenged to develop new techniques for massive high-dimensional data, which is large not only in the number of records but also in the number of attributes. To improve the effectiveness and accuracy of mining tasks on high-dimensional data, an efficient dimensionality reduction method should be applied in the data preprocessing stage, before clustering. Many clustering algorithms have been proposed and used to discover useful information from a dataset. Iterative Local Gaussian Clustering (ILGC) is a simple density-based clustering technique that has successfully discovered the number of clusters represented in a dataset. In this paper we propose using Principal Component Analysis (PCA) to preprocess the data prior to ILGC clustering, in order to simplify the analysis and visualization of multidimensional datasets. The proposed approach is validated on benchmark classification datasets. In addition, the performance of the proposed hybrid PCA-ILGC clustering approach is compared with that of the original ILGC, basic k-means, and hybridized k-means. The experimental results indicate that the proposed approach obtains clusters with higher accuracy and reduces the time taken to process the data.
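The reduce-then-cluster pipeline described in the abstract can be illustrated with a minimal sketch. This record does not describe the steps of ILGC, so the sketch does not attempt to implement it: basic k-means, which the abstract names as a comparison baseline, stands in as the clusterer. The function names `pca_reduce` and `kmeans`, the farthest-first initialization, the synthetic data, and all parameter values are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def kmeans(X, k, n_iter=50):
    """Basic Lloyd's k-means; a stand-in clusterer, NOT ILGC."""
    # Farthest-first initialization (an assumption made for determinism):
    # start from the first point, then repeatedly add the point that is
    # farthest from all centers chosen so far.
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign each point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centers; keep the old center if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Synthetic data (illustrative only): two well-separated groups in 20 dimensions.
rng = np.random.default_rng(42)
group_a = rng.normal(loc=0.0, scale=1.0, size=(50, 20))
group_b = rng.normal(loc=8.0, scale=1.0, size=(50, 20))
X = np.vstack([group_a, group_b])

X_reduced = pca_reduce(X, n_components=2)  # preprocessing step
labels = kmeans(X_reduced, k=2)            # clustering on the reduced data
```

Clustering on `X_reduced` rather than `X` means every distance computation runs over 2 coordinates instead of 20, which is the kind of saving the abstract attributes to the preprocessing stage. Any density-based clusterer could take the place of `kmeans` here.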
Author Ngah, R.
Musdholifah, A.
Hashim, S. Z. M.
Author_xml – sequence: 1
  givenname: A.
  surname: Musdholifah
  fullname: Musdholifah, A.
  email: aina_m@ugm.ac.id
  organization: Dept. of Software Eng., Univ. Teknol. Malaysia (UTM), Skudai, Malaysia
– sequence: 2
  givenname: S. Z. M.
  surname: Hashim
  fullname: Hashim, S. Z. M.
  email: sitizaiton@utm.my
  organization: Dept. of Software Eng., Univ. Teknol. Malaysia (UTM), Skudai, Malaysia
– sequence: 3
  givenname: R.
  surname: Ngah
  fullname: Ngah, R.
  email: razalin@fke.utm.my
  organization: Wireless Commun. Centre, Univ. Teknol. Malaysia (UTM), Skudai, Malaysia
ContentType Conference Proceeding
DOI 10.1109/ICSMC.2012.6377760
Discipline Engineering
Sciences (General)
EISBN 1467317144
9781467317122
9781467317146
1467317128
EndPage 424
ExternalDocumentID 6377760
Genre orig-research
ISBN 9781467317139
1467317136
ISICitedReferencesCount 1
ISSN 1062-922X
IsPeerReviewed false
IsScholarly true
Language English
PageCount 5
PublicationCentury 2000
PublicationDate 2012-Oct.
PublicationDateYYYYMMDD 2012-10-01
PublicationDate_xml – month: 10
  year: 2012
  text: 2012-Oct.
PublicationDecade 2010
PublicationTitle 2012 IEEE International Conference on Systems, Man, and Cybernetics (SMC)
PublicationTitleAbbrev ICSMC
PublicationYear 2012
Publisher IEEE
Publisher_xml – name: IEEE
StartPage 420
SubjectTerms Accuracy
Algorithm design and analysis
Clustering
Clustering algorithms
Data mining
Data visualization
dimensionality reduction
Heart
iterative local Gaussian clustering algorithm
Principal component analysis
Title Hybrid PCA-ILGC clustering approach for high dimensional data
URI https://ieeexplore.ieee.org/document/6377760