An ensemble agglomerative hierarchical clustering algorithm based on clusters clustering technique and the novel similarity measurement
| Published in: | Journal of King Saud University. Computer and information sciences, Vol. 34, No. 6, pp. 3828-3842 |
|---|---|
| Main authors: | Teng Li, Amin Rezaeipanah, ElSayed M. Tag El Din |
| Format: | Journal Article |
| Language: | English |
| Published: | Springer, 1 June 2022 |
| Keywords: | |
| ISSN: | 1319-1578 |
| Online access: | Full text |
| Abstract | The advent of architectures such as the Internet of Things (IoT) has led to dramatic growth in data volume and the production of big data. Managing this often-unlabeled data is a major practical challenge. Hierarchical Clustering (HC) is recognized as an efficient unsupervised approach to unlabeled data analysis. In data mining, HC groups data at different scales by building a dendrogram. One of the most common HC methods is Agglomerative Hierarchical Clustering (AHC), in which clusters are created bottom-up. In addition, ensemble clustering approaches are now used for complex problems because individual clustering methods are weak on their own. Accordingly, we propose a clustering framework using AHC methods based on ensemble approaches, which includes the clusters clustering technique and a novel similarity measurement. The proposed algorithm is a Meta-Clustering Ensemble scheme based on Model Selection (MCEMS). MCEMS uses a bi-weighting policy to address the model selection problem and thereby improve ensemble clustering. Specifically, multiple individual AHC methods cluster the data from different aspects to form the primary clusters. Based on the results of the different methods, the similarity between instances is calculated using a novel similarity measurement. The MCEMS scheme creates meta-clusters by re-clustering the primary clusters. After clusters clustering, the optimal number of clusters is determined by merging similar clusters subject to a threshold. Finally, the similarity of each instance to the meta-clusters is calculated, and each instance is assigned to the meta-cluster with the highest similarity to form the final clusters. Simulations were performed on several datasets from the UCI repository to evaluate the MCEMS scheme against state-of-the-art algorithms. Extensive experiments show that MCEMS outperforms the HMM, DSPA and WHAC algorithms according to the Wilcoxon test and the cophenetic correlation coefficient. |
|---|---|
| Author | ElSayed M. Tag El Din; Teng Li; Amin Rezaeipanah |
| Author_xml | 1. Teng Li, Artificial Intelligence and Big Data College, Chongqing College of Electronic Engineering, Chongqing 401331, China (corresponding author); 2. Amin Rezaeipanah, Department of Computer Engineering, Persian Gulf University, Bushehr, Iran (corresponding author); 3. ElSayed M. Tag El Din, Electrical Engineering Department, Faculty of Engineering and Technology, Future University in Egypt, New Cairo 11845, Egypt |
| ContentType | Journal Article |
| DBID | DOA |
| DOI | 10.1016/j.jksuci.2022.04.010 |
| DatabaseName | DOAJ Directory of Open Access Journals |
| Discipline | Computer Science |
| EndPage | 3842 |
| ISICitedReferencesCount | 69 |
| ISSN | 1319-1578 |
| IngestDate | Mon Nov 03 22:06:49 EST 2025 |
| IsDoiOpenAccess | true |
| IsOpenAccess | true |
| IsPeerReviewed | true |
| IsScholarly | true |
| Issue | 6 |
| Language | English |
| OpenAccessLink | https://doaj.org/article/35aaf34f0bd248bbb60a2a8667f37387 |
| PageCount | 15 |
| ParticipantIDs | doaj_primary_oai_doaj_org_article_35aaf34f0bd248bbb60a2a8667f37387 |
| PublicationCentury | 2000 |
| PublicationDate | 2022-06-01 |
| PublicationDateYYYYMMDD | 2022-06-01 |
| PublicationDate_xml | – month: 06 year: 2022 text: 2022-06-01 day: 01 |
| PublicationDecade | 2020 |
| PublicationTitle | Journal of King Saud University. Computer and information sciences |
| PublicationYear | 2022 |
| Publisher | Springer |
| Publisher_xml | – name: Springer |
| SourceID | doaj |
| SourceType | Open Website |
| StartPage | 3828 |
| SubjectTerms | Clusters clustering; Ensemble clustering; Hierarchical clustering; Meta-clusters; Model selection; Similarity measurement |
| Title | An ensemble agglomerative hierarchical clustering algorithm based on clusters clustering technique and the novel similarity measurement |
| URI | https://doaj.org/article/35aaf34f0bd248bbb60a2a8667f37387 |
| Volume | 34 |
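
The abstract describes an ensemble in which several individual AHC methods produce primary clusterings that are then combined through a similarity between instances. The following is a minimal sketch of that general idea only, not the MCEMS algorithm itself. It assumes a plain co-association matrix (the fraction of runs in which two instances share a cluster) in place of the paper's novel similarity measurement, and it omits the bi-weighting policy, the meta-cluster construction and the merging threshold. The function names `agglomerative` and `ensemble_ahc` and the toy data are hypothetical.

```python
from itertools import combinations
from math import dist


def mean(values):
    """Arithmetic mean; used as the average-linkage aggregator."""
    return sum(values) / len(values)


def agglomerative(n, d, k, linkage):
    """Naive bottom-up clustering of n items down to k clusters.

    d(i, j) returns the distance between items i and j.
    linkage aggregates pairwise distances between two clusters:
    min (single), max (complete) or mean (average).
    """
    clusters = [[i] for i in range(n)]
    while len(clusters) > k:
        # Pick the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(
                [d(i, j) for i in clusters[p[0]] for j in clusters[p[1]]]
            ),
        )
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * n
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels


def ensemble_ahc(points, k):
    """Combine single, complete and average linkage runs by co-association."""
    n = len(points)

    def euclid(i, j):
        return dist(points[i], points[j])

    # Primary clusterings from several individual AHC methods.
    runs = [agglomerative(n, euclid, k, link) for link in (min, max, mean)]
    # Co-association: fraction of runs that place i and j together.
    # This is an assumed stand-in, not the paper's similarity measurement.
    co = [[mean([r[i] == r[j] for r in runs]) for j in range(n)] for i in range(n)]
    # Consensus: average-linkage AHC on the distance 1 - co-association.
    return agglomerative(n, lambda i, j: 1 - co[i][j], k, mean)


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 5.2), (5.2, 4.9)]
labels = ensemble_ahc(points, k=2)
```

The naive merge loop is far too slow for large datasets and is kept only so the sketch runs on the standard library; a real experiment would more plausibly rely on an optimized hierarchical clustering implementation.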