Classification rule mining based on Pareto-based Multiobjective Optimization
| Published in: | Applied Soft Computing, Vol. 127, p. 109321 |
|---|---|
| Main authors: | , |
| Format: | Journal Article |
| Language: | English |
| Publication details: | Elsevier B.V., 01.09.2022 |
| Subjects: | |
| ISSN: | 1568-4946, 1872-9681 |
| Summary: | This paper introduces CRM-PM, a novel classification rule mining model based on Pareto-based multiobjective optimization. Rule extraction is a challenging classification task in data mining because it involves several constraints and conflicting objectives, such as accuracy and comprehensibility. In this study, the task is treated as a multi-objective optimization problem, with classification accuracy and misclassification ratio as the evaluation criteria. Candidate solutions are generated according to a proposed strategy that determines optimal ranges for the attributes that form the rules. The approach is applied to eight benchmark datasets (Iris Plants, Wine Quality, Glass Identification, Statlog (Heart), Haberman’s Survival, Ecoli, Wisconsin Breast Cancer, and Pima Indians Diabetes) from the University of California, Irvine (UCI) Machine Learning Repository. CRM-PM is run in three validation modes: cross-validation, training without test data, and training with random splitting. The experimental results indicate that the method has promising classification capability and achieves comparable or superior results.
• A novel CRM model based on multiobjective optimization that runs fast and has a high accuracy rate.
• A stepwise rule extraction process.
• A new flexible structure for determining the optimum ranges of each attribute in multi-class datasets.
• A new running model based on a removal strategy that improves classification accuracy.
• A fair comparison under three validation modes: cross-validation, training without test data, and random splitting. |
|---|---|
| DOI: | 10.1016/j.asoc.2022.109321 |
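
The summary describes rules built from attribute ranges and judged by two competing criteria. As a rough illustration only, and not the authors' CRM-PM procedure, the Python sketch below scores a few hand-written interval rules on a toy dataset and keeps the non-dominated ones. The names `Rule`, `objectives`, `dominates`, and `pareto_front` are hypothetical, and the two scores (share of the target class covered, share of other classes wrongly covered) are assumed stand-ins for the paper's accuracy and misclassification ratio, whose exact definitions are not given in this record.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence, Tuple


@dataclass
class Rule:
    """IF every listed attribute lies in its [low, high] range THEN label."""
    ranges: Dict[int, Tuple[float, float]]  # attribute index -> (low, high)
    label: int

    def covers(self, x: Sequence[float]) -> bool:
        return all(lo <= x[i] <= hi for i, (lo, hi) in self.ranges.items())


def objectives(rule: Rule, X, y) -> Tuple[float, float]:
    """Return (hit rate to maximize, false-cover rate to minimize).

    Assumed definitions, chosen only for this sketch.
    """
    pos = sum(1 for t in y if t == rule.label)
    neg = len(y) - pos
    tp = sum(1 for x, t in zip(X, y) if t == rule.label and rule.covers(x))
    fp = sum(1 for x, t in zip(X, y) if t != rule.label and rule.covers(x))
    return tp / max(pos, 1), fp / max(neg, 1)


def dominates(a: Tuple[float, float], b: Tuple[float, float]) -> bool:
    """a dominates b: no worse on both objectives, strictly better on one."""
    return a[0] >= b[0] and a[1] <= b[1] and (a[0] > b[0] or a[1] < b[1])


def pareto_front(rules: List[Rule], X, y) -> List[Rule]:
    """Keep the rules whose scores are not dominated by any other rule."""
    scored = [(r, objectives(r, X, y)) for r in rules]
    return [r for r, s in scored if not any(dominates(t, s) for _, t in scored)]


# Toy data: one attribute, two overlapping classes (made up for illustration).
X = [[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]]
y = [0, 0, 1, 0, 1, 1]

candidates = [
    Rule({0: (0.0, 2.5)}, label=0),  # narrow range: misses one class-0 sample
    Rule({0: (0.0, 4.5)}, label=0),  # wide range: also covers a class-1 sample
    Rule({0: (0.0, 3.5)}, label=0),  # worse than both of the rules above
]
front = pareto_front(candidates, X, y)
```

On this toy data the narrow and the wide rule both survive, because neither is better on both criteria, while the third rule is dominated and dropped. A real rule miner would search over the ranges rather than enumerate them by hand.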