A New Approach to Define the Number of Clusters for Partitional Clustering Algorithms

Bibliographic details
Published in: Transactions on Fuzzy Sets and Systems, Vol. 3, No. 1, pp. 67-87
Main authors: Huliane Silva, Benjamín René Callejas Bedregal, Anne Canuto, Thiago Batista, Ronildo Moura
Format: Journal Article
Language: English
Publication details: Islamic Azad University, Bandar Abbas Branch, 01.05.2024
ISSN:2821-0131
Description
Summary: Data clustering consists of grouping similar objects according to some characteristic. Among the many clustering algorithms in the literature, Fuzzy C-Means (FCM) stands out as one of the most widely discussed and is used in many different applications. Although it is a simple and easy-to-use clustering method, FCM requires the number of clusters as an initial parameter. This information is usually not known beforehand, which is a significant problem in cluster analysis. In this context, this work proposes a new methodology for determining the number of clusters for partitional algorithms, using subsets of the original data to define that number. The methodology is intended to reduce the side effects of the cluster definition phase, possibly shortening processing time and decreasing computational cost. To evaluate the proposed methodology, different cluster validation indices are used to assess the quality of the clusters obtained by the FCM algorithm and some of its variants when applied to different databases. Through the empirical analysis, we conclude that the results obtained in this article are promising from both an experimental and a statistical point of view.
DOI:10.30495/tfss.2023.1990425.1078