Integrating Compound Terms in Bayesian Text Classification
| Published in: | IEEE/WIC/ACM International Conference on Web Intelligence, pp. 598-601 |
|---|---|
| Format: | Conference Proceeding |
| Language: | English |
| Published: | Washington, DC, USA: IEEE Computer Society; IEEE, 19.09.2005 |
| Series: | ACM Conferences |
| Subjects: | Computing methodologies > Artificial intelligence > Natural language processing > Language resources; Computing methodologies > Machine learning > Learning paradigms > Supervised learning > Supervised learning by classification |
| ISBN: | 076952415X, 9780769524153 |
| Summary: | Text classification usually assumes a word-based document representation. In this paper, we propose a new approach to integrating compound terms into Bayesian text classification. Compound terms are used as features complementary to single words. An acute problem is how to account for their dependence on their component words. We propose using smoothing techniques to combine the compound-term and word representations. Experiments were conducted on two corpora. The results show that this approach slightly but consistently improves classification performance on both test corpora. |
|---|---|
| DOI: | 10.1109/WI.2005.79 |

