Text feature selection algorithm based on Chi-square rank correlation factorization
| Published in: | Journal of interdisciplinary mathematics, Vol. 20, No. 1, pp. 153-160 |
|---|---|
| Main author: | |
| Format: | Journal Article |
| Language: | English |
| Published: | Taylor & Francis, 2 January 2017 |
| Subjects: | |
| ISSN: | 0972-0502, 2169-012X |
| Summary: | Feature selection is an effective method for reducing the dimensionality of text features. Existing feature selection methods usually rely on empirical estimation to determine the scale of feature selection. These methods have achieved good results on some specific corpora. However, because they lack a sufficient theoretical basis, they are difficult to generalize to automatic text categorization more broadly. Therefore, a text feature selection algorithm based on Chi-square rank correlation factorization is proposed, which takes into account both the global and local distributions of text features. The algorithm requires no prior knowledge: it characterizes the feature weights and completes the feature selection in a way that fully reflects the characteristics of the probability distribution. |
|---|---|
| DOI: | 10.1080/09720502.2016.1259769 |