Application of locally linear embedding algorithm on hotel data text classification

Detailed bibliography
Published in: Journal of physics. Conference series, Volume 1634, Issue 1, pp. 12014 - 12019
Main author: Huang, Jinming
Format: Journal Article
Language: English
Published: Bristol, IOP Publishing, 1 September 2020
ISSN: 1742-6588, 1742-6596
Abstract As a nonlinear dimensionality reduction method, manifold learning projects high-dimensional input into a low-dimensional space while preserving the local structure of the data, thereby revealing the intrinsic geometric structure hidden in the data. In this paper, we apply manifold learning to Chinese text classification, using the locally linear embedding (LLE) algorithm to reduce the dimensionality of the Ctrip hotel review data set. We then classify the text with extreme gradient boosting (XGBoost) and logistic regression. Experimental results show that manifold learning is effective and feasible for text classification. Moreover, logistic regression classifies the hotel reviews better than XGBoost.
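The pipeline the abstract describes (vectorize text, reduce dimensionality with LLE, then classify) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the toy English reviews, the TF-IDF features, and every parameter value (`n_neighbors=4`, `n_components=2`) are hypothetical choices made for the example, and scikit-learn is assumed to be available. The record does not say how the Chinese reviews were segmented or vectorized.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.manifold import LocallyLinearEmbedding

# Hypothetical toy reviews (1 = positive, 0 = negative). The paper uses
# Chinese Ctrip hotel reviews, which would need word segmentation first.
reviews = [
    "clean room and friendly staff",
    "great location and clean room",
    "friendly staff and great breakfast",
    "great breakfast and quiet room",
    "dirty room and rude staff",
    "noisy room and bad location",
    "rude staff and bad breakfast",
    "bad smell and dirty bathroom",
]
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Step 1: map each review to a high-dimensional TF-IDF vector.
# LLE in scikit-learn expects a dense array, hence toarray().
tfidf = TfidfVectorizer().fit_transform(reviews).toarray()

# Step 2: reduce dimensionality with locally linear embedding.
# n_neighbors and n_components are illustrative values only.
lle = LocallyLinearEmbedding(n_neighbors=4, n_components=2, random_state=0)
embedded = lle.fit_transform(tfidf)

# Step 3: train a classifier on the low-dimensional embedding.
clf = LogisticRegression().fit(embedded, labels)
predictions = clf.predict(embedded)
```

To compare against gradient boosting as the abstract does, one could swap `LogisticRegression` for `xgboost.XGBClassifier`, assuming the `xgboost` package is installed; it follows the same `fit`/`predict` interface. A real evaluation would also hold out a test split rather than predicting on the training data.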
Author Huang, Jinming
Author_xml – sequence: 1
  givenname: Jinming
  surname: Huang
  fullname: Huang, Jinming
  email: jinminghhh@163.com
  organization: College of Mathematics and Systems Science, Shandong University of Science and Technology, China
ContentType Journal Article
Copyright Published under licence by IOP Publishing Ltd
2020. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1088/1742-6596/1634/1/012014
Discipline Physics
DocumentTitleAlternate Application of locally linear embedding algorithm on hotel data text classification
EISSN 1742-6596
ISSN 1742-6588
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 1
Language English
License Content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
OpenAccessLink https://iopscience.iop.org/article/10.1088/1742-6596/1634/1/012014
PageCount 6
PublicationCentury 2000
PublicationDate 20200901
PublicationDateYYYYMMDD 2020-09-01
PublicationDate_xml – month: 09
  year: 2020
  text: 20200901
  day: 01
PublicationDecade 2020
PublicationPlace Bristol
PublicationPlace_xml – name: Bristol
PublicationTitle Journal of physics. Conference series
PublicationTitleAlternate J. Phys.: Conf. Ser
PublicationYear 2020
Publisher IOP Publishing
Publisher_xml – name: IOP Publishing
StartPage 12014
SubjectTerms Algorithms
Classification
Embedding
Hotels & motels
Machine learning
Manifolds (mathematics)
Physics
Text categorization
Title Application of locally linear embedding algorithm on hotel data text classification
URI https://iopscience.iop.org/article/10.1088/1742-6596/1634/1/012014
https://www.proquest.com/docview/2570977439
Volume 1634