Fuzzy Ordered c-Means Clustering and Least Angle Regression for Fuzzy Rule-Based Classifier: Study for Imbalanced Data


Bibliographic details
Published in: IEEE Transactions on Fuzzy Systems, vol. 28, no. 11, pp. 2799-2813
Main authors: Leski, Jacek M.; Czabanski, Robert; Jezewski, Michal; Jezewski, Janusz
Format: Journal Article
Language: English
Published: IEEE, 1 November 2020
ISSN:1063-6706, 1941-0034
Description
Abstract: This article introduces a new classifier design method based on a modification of traditional fuzzy clustering. First, a new fuzzy ordered c-means clustering is proposed. This method can be considered a generalization of conditional fuzzy clustering that introduces ordering and weighting of the distances from data to cluster prototypes. As a result, the data have a more local impact on the created groups, and the repulsive force between group prototypes increases. The proposed method provides a better representation of the data classes, in particular for classes with small cardinality in the training set (imbalanced data). A special initialization of the prototypes is also introduced. Next, the proposed clustering method is used to construct the premises of the if-then rules of a fuzzy classifier. The conclusions of the rules are obtained by the least angle regression algorithm, which selects only those rules that maximize the generalization ability of the classifier. Each if-then rule is represented in the easily interpretable Mamdani-Assilian form. Finally, an extensive experimental analysis on 89 benchmark balanced and imbalanced datasets is performed to demonstrate the validity of the introduced classifier. Its competitiveness with state-of-the-art classifiers, with respect to both performance and interpretability, is shown as well.
DOI:10.1109/TFUZZ.2019.2939989
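
The abstract describes the proposed clustering as a modification of traditional fuzzy clustering. For orientation only, the standard fuzzy c-means baseline can be sketched as follows. This is not the authors' ordered variant: the ordering and weighting of distances, and the special prototype initialization, are not specified in this record. The function name `fuzzy_c_means`, the fuzzifier `m = 2`, the fixed iteration count, the random initialization, and the toy imbalanced data are all assumptions made for illustration.

```python
import math
import random


def fuzzy_c_means(points, c, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means (baseline only, not the ordered variant).

    points: list of equal-length tuples of floats
    c: number of clusters; m: fuzzifier, must be > 1
    Returns (prototypes, memberships), where memberships[k][i] is the
    degree to which points[k] belongs to cluster i, as computed in the
    last membership update.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    exponent = 2.0 / (m - 1.0)
    # Assumption: prototypes start at randomly chosen data points.
    prototypes = [list(p) for p in rng.sample(points, c)]
    memberships = []
    for _ in range(iters):
        # Membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2 / (m - 1))
        memberships = []
        for p in points:
            dists = [math.dist(p, v) for v in prototypes]
            if any(d == 0.0 for d in dists):
                # The point coincides with one or more prototypes:
                # share full membership among those prototypes.
                zeros = [1.0 if d == 0.0 else 0.0 for d in dists]
                total = sum(zeros)
                memberships.append([z / total for z in zeros])
                continue
            memberships.append(
                [1.0 / sum((di / dj) ** exponent for dj in dists) for di in dists]
            )
        # Prototype update: mean of the data weighted by u_ki^m
        for i in range(c):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            prototypes[i] = [
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ]
    return prototypes, memberships


# Toy imbalanced data (hypothetical): six points in one group, two in another.
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (-0.1, 0.0), (0.0, -0.2), (0.2, 0.2),
    (5.0, 5.0), (5.1, 4.9),
]
prototypes, memberships = fuzzy_c_means(points, c=2)
```

In this baseline every point influences every prototype through its membership degree, which is the behavior the abstract says the ordered variant makes more local.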