Optimization and Parallelization of Fuzzy Clustering Algorithm Based on the Improved Kmeans++ Clustering

Bibliographic Details
Published in: IOP conference series. Materials Science and Engineering, Vol. 768, no. 7, pp. 72106-72112
Main Authors: Ma, Yu; Cheng, Wenjuan
Format: Journal Article
Language: English
Published: Bristol, IOP Publishing, 01.03.2020
Subjects: Algorithms; Clustering; Optimization; Parallel processing
ISSN: 1757-8981, 1757-899X
Abstract: In the field of big data, fuzzy clustering is one of the most widely used families of clustering algorithms. Although the fuzzy c-means (FCM) algorithm performs well, it remains difficult to determine the number of clusters, and the algorithm is sensitive to the initial cluster centers. To solve these problems, an improved fuzzy clustering algorithm based on Spark is proposed in this article. The proposed algorithm integrates the L2 norm and uses the kmeans++ algorithm, improved by the Canopy algorithm, to initialize the cluster centers. Experimental results show that the proposed algorithm performs well in clustering accuracy and computational performance.
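Illustrative sketch (not part of the record): to make the abstract's terms concrete, the following single-machine Python example shows standard fuzzy c-means started from ordinary kmeans++ seeding. It is an assumption-laden illustration, not the authors' method: it omits the Canopy step, the L2-norm integration, and the Spark parallelization, and the function names and parameters (`fuzzy_c_means`, `m`, `tol`, `max_iter`, `seed`) are hypothetical choices made here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, c, rng):
    """Plain kmeans++ seeding: each new seed is drawn with probability
    proportional to its squared distance from the nearest chosen seed.
    Assumes the data contain at least c distinct points."""
    seeds = [rng.choice(points)]
    while len(seeds) < c:
        weights = [min(sq_dist(p, s) for s in seeds) for p in points]
        seeds.append(rng.choices(points, weights=weights, k=1)[0])
    return seeds


def memberships(points, centers, m):
    """FCM membership update: u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    power = 1.0 / (m - 1.0)  # applied to squared distances
    u = []
    for p in points:
        # Clamp to avoid division by zero when a point sits on a center.
        d2 = [max(sq_dist(p, v), 1e-12) for v in centers]
        inv = [(1.0 / d) ** power for d in d2]
        total = sum(inv)
        u.append([w / total for w in inv])
    return u


def update_centers(points, u, c, m):
    """Center update: mean of the points weighted by membership^m."""
    dim = len(points[0])
    centers = []
    for j in range(c):
        w = [row[j] ** m for row in u]
        total = sum(w)
        centers.append(tuple(
            sum(wi * p[k] for wi, p in zip(w, points)) / total
            for k in range(dim)))
    return centers


def fuzzy_c_means(points, c, m=2.0, tol=1e-6, max_iter=100, seed=0):
    """Alternate membership and center updates until centers stop moving."""
    rng = random.Random(seed)
    centers = kmeanspp_seeds(points, c, rng)
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = update_centers(points, u, c, m)
        shift = max(sq_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships(points, centers, m)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    centers, u = fuzzy_c_means(data, c=2)
    print(sorted(centers))
```

In this sketch the number of clusters `c` must still be supplied by the caller, which is the limitation the abstract says the Canopy-based initialization is meant to address.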
Author Details:
1. Ma, Yu, School of Computer Science and Information Engineering, HeFei University of Technology, China
2. Cheng, Wenjuan (cheng@ah.edu.cn), School of Computer Science and Information Engineering, HeFei University of Technology, China
Copyright: Published under licence by IOP Publishing Ltd
2020. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI: 10.1088/1757-899X/768/7/072106