Unbalanced Data Classification Based on Hesitant Fuzzy Decision Tree
| Published in: | Ji suan ji gong cheng Vol. 45; No. 8; pp. 75-79, 91 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | Chinese, English |
| Published: | Editorial Office of Computer Engineering, 1 August 2019 |
| Subjects: | |
| ISSN: | 1000-3428 |
| Summary: | To improve the classification of unbalanced data, an improved fuzzy decision tree algorithm is proposed that combines hesitant fuzzy set theory with the decision tree algorithm. The unbalanced data is oversampled with the SMOTE algorithm, the cluster center of each attribute is obtained by K-means clustering, and the dataset is fuzzified using two different membership functions. On this basis, the Hesitant Fuzzy Information Gain (HFIG) of each attribute is computed from the information energy of hesitant fuzzy sets and the membership functions. The largest HFIG replaces the Fuzzy Information Gain (FIG) of the Fuzzy ID3 algorithm as the attribute split criterion, and a Hesitant Fuzzy Decision Tree (HFDT) model is constructed for unbalanced data classification. Experimental results show that, compared with traditional classification algorithms such as C4.5, KNN, and random forest, the HFDT-based classifier achieves an average increase of 12.6% in the AUC evaluation metric. |
|---|---|
| DOI: | 10.19678/j.issn.1000-3428.0051759 |
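
The summary mentions fuzzifying each attribute with two different membership functions and using the information energy of hesitant fuzzy sets. The record does not give the paper's formulas, so the following is only a minimal sketch under stated assumptions: the two membership functions are taken to be triangular and Gaussian, the cluster centers and widths are hand-picked values rather than K-means output, and the information energy uses one common definition (the sum over elements of the mean squared membership degree). All function names are hypothetical, and the HFIG computation itself is not reproduced here.

```python
import math


def triangular(x, left, center, right):
    """Assumed membership function 1: peaks at `center`, zero outside (left, right)."""
    if x == center:
        return 1.0
    if left < x < center:
        return (x - left) / (center - left)
    if center < x < right:
        return (right - x) / (right - center)
    return 0.0


def gaussian(x, center, sigma):
    """Assumed membership function 2: bell curve centered on `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2))


def hesitant_element(x, left, center, right, sigma):
    """Collect both membership degrees of x as one hesitant fuzzy element."""
    return [triangular(x, left, center, right), gaussian(x, center, sigma)]


def information_energy(hesitant_set):
    """Sum, over all elements, of the mean of the squared membership degrees.

    This is one common definition for hesitant fuzzy sets; the paper may
    use a different normalization.
    """
    return sum(sum(h * h for h in element) / len(element)
               for element in hesitant_set)


# Hypothetical attribute values fuzzified around an assumed cluster center of 2.0.
values = [1.5, 2.0, 2.8]
hfs = [hesitant_element(v, left=1.0, center=2.0, right=3.0, sigma=0.5)
       for v in values]
energy = information_energy(hfs)
```

In a fuller pipeline, the centers would come from clustering each attribute after oversampling, and the energy would feed into whatever gain formula the paper defines.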