Decision Tree Algorithm Considering Distances Between Classes
| Published in: | IEEE Access Vol. 10; p. 1 |
|---|---|
| Format: | Journal Article |
| Language: | English |
| Published: | Piscataway: IEEE, 2022. The Institute of Electrical and Electronics Engineers, Inc. (IEEE) |
| ISSN: | 2169-3536 |
| Summary: | The decision tree (DT) algorithm is a commonly used data mining method for classification and regression. A DT repeatedly divides a dataset into purer subsets based on impurity measures such as entropy and the Gini index, yielding relatively "pure" partitions consisting of observations with (almost) the same class. The Gini index is one of the representative indices for measuring the impurity of data. However, it does not take into account distances between classes. If these distances are considered when measuring impurity, the decision tree algorithm can more clearly distinguish observations of different classes. To this end, a new decision tree algorithm based on the Rao-Stirling index is proposed. The Rao-Stirling index accounts for distances between classes by giving more weight to pairs of observations in more distant classes when measuring data impurity. Experimental results indicate that the proposed method is superior in terms of accuracy, implying that considering the distances between classes can help improve accuracy in DT. |
|---|---|
| DOI: | 10.1109/ACCESS.2022.3187172 |
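
The contrast drawn in the summary between the Gini index and a distance-aware impurity measure can be illustrated with a minimal sketch. It assumes the common form of the Rao-Stirling index with both exponents set to 1, summed over ordered pairs of distinct classes, and it uses a hypothetical ordinal distance `|a - b|` between integer class labels. The function names, the distance function, and the toy data are illustrative assumptions and are not taken from the paper, whose exact formulation may differ.

```python
from collections import Counter


def class_proportions(labels):
    """Return {class: relative frequency} for a non-empty list of labels."""
    counts = Counter(labels)
    n = len(labels)
    return {c: k / n for c, k in counts.items()}


def gini(labels):
    """Gini impurity: 1 - sum_i p_i^2 (ignores how far apart classes are)."""
    p = class_proportions(labels)
    return 1.0 - sum(v * v for v in p.values())


def rao_stirling(labels, dist):
    """Rao-Stirling index, assumed here as sum over ordered pairs i != j
    of p_i * p_j * d_ij. `dist` is a caller-supplied class distance."""
    p = class_proportions(labels)
    return sum(p[a] * p[b] * dist(a, b) for a in p for b in p if a != b)


def ordinal_distance(a, b):
    """Hypothetical distance for ordinal integer classes (an assumption)."""
    return abs(a - b)


# Two nodes with the same 50/50 class mix but different class distances.
near = [0, 0, 1, 1]  # adjacent classes
far = [0, 0, 2, 2]   # classes two steps apart

print(gini(near), gini(far))                  # same Gini impurity for both
print(rao_stirling(near, ordinal_distance))   # smaller: classes are close
print(rao_stirling(far, ordinal_distance))    # larger: classes are distant
```

Under these assumptions, setting every between-class distance to 1 makes the Rao-Stirling value coincide with the Gini impurity, so the distance term is the only thing that separates the two measures. A tree builder would plug such a measure into its usual split criterion, choosing the split that most reduces the weighted impurity of the child nodes.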