Application of locally linear embedding algorithm on hotel data text classification

Bibliographic details
Published in: Journal of physics. Conference series, Vol. 1634, No. 1, pp. 12014 - 12019
Main author: Huang, Jinming
Format: Journal Article
Language: English
Publication details: Bristol, IOP Publishing, 01.09.2020
ISSN: 1742-6588, 1742-6596
Description
Summary: As a non-linear dimensionality reduction method, manifold learning projects high-dimensional input into a low-dimensional space while preserving the local structure of the data, and so uncovers the intrinsic geometric structure hidden in it. In this paper, we apply manifold learning to Chinese text classification, using the locally linear embedding (LLE) algorithm to reduce the dimensionality of the Ctrip hotel review data set. We then classify the text with extreme gradient boosting (XGBoost) and logistic regression. Experimental results show that manifold learning is an effective and feasible approach to text classification. Moreover, logistic regression outperforms XGBoost on the hotel review classification task.
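
The pipeline outlined in the summary (text features, then locally linear embedding, then a classifier) might be sketched roughly as follows. This is a minimal illustration and not the author's implementation: the toy reviews, the TF-IDF features, the choice of scikit-learn, and the values of `n_neighbors` and `n_components` are all assumptions. Real Chinese reviews would first need word segmentation, and the XGBoost classifier is left out so that the sketch depends on scikit-learn only.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import LocallyLinearEmbedding

# Hypothetical toy reviews standing in for the Ctrip data set (assumption).
# Tokens are space-separated, as Chinese text would be after word segmentation.
reviews = [
    "room clean staff friendly location great",
    "bed comfortable breakfast good staff friendly",
    "location great room quiet bed comfortable",
    "breakfast good room clean service fast",
    "staff friendly service fast room quiet",
    "location great breakfast good room clean",
    "room dirty staff rude street noisy",
    "shower broken breakfast cold staff rude",
    "street noisy room dirty bed hard",
    "breakfast cold service slow room dirty",
    "staff rude service slow shower broken",
    "street noisy bed hard breakfast cold",
]
labels = [1] * 6 + [0] * 6  # 1 = positive review, 0 = negative review

# Step 1: map each review to a high-dimensional TF-IDF vector.
# The matrix is densified before it is passed to LLE.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reviews).toarray()

# Step 2: reduce the dimensionality with locally linear embedding.
# n_neighbors and n_components are illustrative values, not the paper's.
lle = LocallyLinearEmbedding(n_neighbors=4, n_components=2, random_state=0)
embedding = lle.fit_transform(X)

# Step 3: train a logistic regression classifier on the embedded points.
clf = LogisticRegression()
clf.fit(embedding, labels)

# Embed an unseen review with the fitted LLE model and classify it.
new_vec = vectorizer.transform(["room clean staff friendly"]).toarray()
new_pred = clf.predict(lle.transform(new_vec))
print(embedding.shape, new_pred)
```

On a real corpus, `n_neighbors` and `n_components` would need tuning, and a gradient-boosted classifier could be trained on the same `embedding` array to compare against logistic regression.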
DOI: 10.1088/1742-6596/1634/1/012014