Text feature selection algorithm based on Chi-square rank correlation factorization

Bibliographic Details
Published in: Journal of interdisciplinary mathematics Vol. 20; no. 1; pp. 153-160
Main Author: Li, Yan-Hong
Format: Journal Article
Language: English
Published: Taylor & Francis 02.01.2017
ISSN: 0972-0502, 2169-012X
Description
Summary: Feature selection is an effective method for reducing the dimensionality of text features. Existing feature selection methods usually rely on empirical estimation when determining the scale of the feature selection. These methods have achieved good results on some specific corpora. However, their insufficient theoretical basis makes it difficult to generalize them further in automatic text categorization. Therefore, a text feature selection algorithm based on Chi-square rank correlation factorization is proposed, taking into comprehensive consideration both the global and local distributions of text features. Without requiring any prior knowledge, the algorithm characterizes the feature weights and completes the feature selection in a way that fully reflects the characteristics of the probability distribution.
DOI: 10.1080/09720502.2016.1259769
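
Illustrative note: the summary does not spell out how the Chi-square rank correlation factorization is computed, so the following is only a minimal sketch of the classic chi-square term/category statistic that such methods commonly build on. It is not the article's algorithm. The function names (`chi_square_scores`, `rank_terms`) and the toy corpus are hypothetical and chosen purely for illustration.

```python
from collections import Counter

def chi_square_scores(docs, labels, category):
    """Score each term with the classic chi-square statistic for one category.

    Assumption: this is the textbook 2x2 contingency-table statistic,
    not the rank correlation factorization proposed in the article.
    """
    n = len(docs)
    in_cat = sum(1 for y in labels if y == category)
    df_total = Counter()  # number of documents containing the term
    df_cat = Counter()    # number of documents in `category` containing the term
    for tokens, y in zip(docs, labels):
        for term in set(tokens):
            df_total[term] += 1
            if y == category:
                df_cat[term] += 1

    scores = {}
    for term, total in df_total.items():
        a = df_cat[term]        # in category, contains term
        b = total - a           # not in category, contains term
        c = in_cat - a          # in category, lacks term
        d = n - in_cat - b      # not in category, lacks term
        denom = (a + c) * (b + d) * (a + b) * (c + d)
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def rank_terms(scores):
    """Order terms by descending score, breaking ties alphabetically."""
    return sorted(scores, key=lambda t: (-scores[t], t))

# Hypothetical toy corpus: two "sport" and two "finance" documents.
docs = [
    ["goal", "match", "team"],
    ["team", "score", "goal"],
    ["stock", "market", "price"],
    ["market", "trade", "team"],
]
labels = ["sport", "sport", "finance", "finance"]

scores = chi_square_scores(docs, labels, "sport")
ranking = rank_terms(scores)
```

In this toy corpus, terms that appear only in one class ("goal", "market") receive the highest score, while "team", which occurs in both classes, ranks lower. How many top-ranked terms to keep is exactly the scale question that the summary says existing methods settle empirically.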