Integrating Compound Terms in Bayesian Text Classification

Bibliographic details
Published in: IEEE/WIC/ACM International Conference on Web Intelligence, pp. 598-601
Main authors: Bai, Jing; Nie, Jian-Yun; Cao, Guihong
Format: Conference paper
Language: English
Published: Washington, DC, USA: IEEE Computer Society, 19 September 2005
IEEE
Series: ACM Conferences
ISBN: 076952415X, 9780769524153
Description
Summary: Text classification usually assumes a word-based document representation. In this paper, we propose a new approach to integrating compound terms into Bayesian text classification. Compound terms are used as features complementary to single words. An acute problem is accounting for their dependence on their component words. To address it, we propose using smoothing techniques to combine the compound-term and word representations. Experiments on two corpora show that this approach slightly but consistently improves classification performance on both.
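The summary does not say which smoothing scheme is used, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a naive Bayes classifier in which the probability of a compound term is a linear interpolation between its own relative frequency in the class and the product of its component-word probabilities. All names (`train`, `word_prob`, `compound_prob`, `classify`), the weight `lam`, and the toy data are hypothetical.

```python
import math
from collections import Counter


def train(docs):
    """docs: iterable of (label, words, compounds); a compound is a tuple of words."""
    model = {}
    for label, words, compounds in docs:
        m = model.setdefault(
            label, {"docs": 0, "words": Counter(), "compounds": Counter()}
        )
        m["docs"] += 1
        m["words"].update(words)
        m["compounds"].update(compounds)
    return model


def word_prob(m, word, vocab_size):
    # Add-one (Laplace) smoothing keeps unseen words at non-zero probability.
    return (m["words"][word] + 1) / (sum(m["words"].values()) + vocab_size)


def compound_prob(m, compound, vocab_size, lam):
    # Assumed scheme: interpolate the compound's relative frequency in the
    # class with the product of its component-word probabilities.
    total = sum(m["compounds"].values())
    rel_freq = m["compounds"][compound] / total if total else 0.0
    backoff = math.prod(word_prob(m, w, vocab_size) for w in compound)
    return lam * rel_freq + (1 - lam) * backoff


def classify(model, words, compounds, lam=0.5):
    vocab = set().union(*(m["words"] for m in model.values()))
    n_docs = sum(m["docs"] for m in model.values())
    scores = {}
    for label, m in model.items():
        score = math.log(m["docs"] / n_docs)  # class prior
        score += sum(math.log(word_prob(m, w, len(vocab))) for w in words)
        score += sum(
            math.log(compound_prob(m, c, len(vocab), lam)) for c in compounds
        )
        scores[label] = score
    return max(scores, key=scores.get)


# Toy data, invented purely for illustration.
docs = [
    ("ml", ["machine", "learning", "model"], [("machine", "learning")]),
    ("auto", ["machine", "engine", "oil"], [("engine", "oil")]),
]
model = train(docs)
print(classify(model, ["machine", "learning"], [("machine", "learning")]))
```

Scoring a compound alongside its component words counts overlapping evidence more than once, which is the dependence problem the summary mentions. In this sketch the weight `lam` only controls how far the compound's own count is trusted over its components.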
DOI: 10.1109/WI.2005.79