Decision Tree Algorithm Considering Distances Between Classes

Decision tree algorithm (DT) is a commonly used data mining method for classification and regression. DT repeatedly divides a dataset into pure subsets based on impurity measurements such as entropy and Gini. Then relatively "pure" partitions consisting of observations with the (almost) sa...

Celý popis

Uloženo v:
Podrobná bibliografie
Vydáno v:IEEE access Ročník 10; s. 1
Hlavní autoři: Lee, Sangyong, Lee, ChulHee, Mun, Kwon Gi, Kim, Dohyun
Médium: Journal Article
Jazyk:angličtina
Vydáno: Piscataway IEEE 2022
The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Témata:
ISSN:2169-3536, 2169-3536
On-line přístup:Získat plný text
Tagy: Přidat tag
Žádné tagy, Buďte první, kdo vytvoří štítek k tomuto záznamu!
Popis
Shrnutí:Decision tree algorithm (DT) is a commonly used data mining method for classification and regression. DT repeatedly divides a dataset into pure subsets based on impurity measurements such as entropy and Gini. Then relatively "pure" partitions consisting of observations with the (almost) same class are obtained. Gini index is one of the representative indices for measuring the impurity of data. However, the Gini index does not take into account distances between classes. If the distances between classes are considered when measuring impurity, the decision tree algorithm can distinguish clearly observations with different classes. To the end, a new decision tree algorithm based on Rao-Stirling index is proposed considering distances between classes. Rao-Stirling index considers distances between classes in such a way that weights more to pairs of references in more distant classes when measuring data impurity. Experimental results indicate that the proposed method is superior in terms of accuracy, implying that considering the distances between classes can help improve accuracy in DT.
Bibliografie:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2169-3536
2169-3536
DOI:10.1109/ACCESS.2022.3187172