Soft-ECM: An extension of Evidential C-Means for complex data

Bibliographic details
Title: Soft-ECM: An extension of Evidential C-Means for complex data
Authors: Soubeiga, Armel; Guyet, Thomas; Antoine, Violaine
Contributors: Guyet, Thomas
Source: 2025 IEEE International Conference on Fuzzy Systems (FUZZ). :1-6
Publication Status: Preprint
Publisher information: IEEE, 2025.
Publication year: 2025
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], Machine Learning, FOS: Computer and information sciences, [INFO.INFO-DM] Computer Science [cs]/Discrete Mathematics [cs.DM], Artificial Intelligence (cs.AI), Discrete Mathematics (cs.DM), [SDV.SPEE] Life Sciences [q-bio]/Santé publique et épidémiologie, Artificial Intelligence, Discrete Mathematics, Machine Learning (cs.LG)
Description: Clustering based on belief functions has been gaining increasing attention in the machine learning community due to its ability to effectively represent uncertainty and/or imprecision. However, none of the existing algorithms can be applied to complex data, such as mixed data (numerical and categorical) or non-tabular data like time series. Indeed, these types of data are, in general, not represented in a Euclidean space, and the aforementioned algorithms make use of the properties of such spaces, in particular for the construction of barycenters. In this paper, we reformulate the Evidential C-Means (ECM) problem for clustering complex data. We propose a new algorithm, Soft-ECM, which consistently positions the centroids of imprecise clusters, requiring only a semi-metric. Our experiments show that Soft-ECM presents results comparable to conventional fuzzy clustering approaches on numerical data, and we demonstrate its ability to handle mixed data and its benefits when combining fuzzy clustering with semi-metrics such as DTW for time series data.
Document type: Article
Conference object
File description: application/pdf
DOI: 10.1109/fuzz62266.2025.11152191
DOI: 10.48550/arxiv.2507.13417
Access URL: http://arxiv.org/abs/2507.13417
https://inria.hal.science/hal-05162452v1
Rights: STM Policy #29
arXiv Non-Exclusive Distribution
CC BY
Accession number: edsair.doi.dedup.....d7d8c83c895a4b75fd91c01fb53c402b
Database: OpenAIRE
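
Illustrative note: the abstract states that the method needs only a semi-metric, such as DTW, rather than a Euclidean space. The sketch below is not the Soft-ECM algorithm from the paper. It is a minimal, assumed illustration of two background ideas: a textbook dynamic time warping (DTW) dissimilarity that is symmetric and zero on identical inputs but can violate the triangle inequality, and a medoid, which is one common way (chosen here for illustration) to pick a cluster representative from pairwise dissimilarities alone when a barycenter is not available. The names `dtw` and `medoid` and the toy sequences are hypothetical.

```python
def dtw(a, b):
    """Textbook DTW between two numeric sequences (absolute-difference local cost)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = cheapest warping cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def medoid(items, dist):
    """Item with the smallest total dissimilarity to all items.

    Uses only pairwise dissimilarities, so no vector-space mean is required.
    """
    return min(items, key=lambda c: sum(dist(c, x) for x in items))


# Toy sequences (assumed for illustration only).
x = [0, 0, 0]
y = [0, 1]
z = [1, 1, 1]

d_xy, d_yz, d_xz = dtw(x, y), dtw(y, z), dtw(x, z)
# Triangle inequality fails here: d_xz > d_xy + d_yz.
triangle_violated = d_xz > d_xy + d_yz
representative = medoid([x, y, z], dtw)
```

On these toy sequences the direct dissimilarity between `x` and `z` exceeds the sum of the two dissimilarities through `y`, which is why methods that rely on Euclidean properties cannot be applied unchanged to DTW.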