Chinese Natural Language Processing: From Text Categorization to Machine Translation

The level and volume of automatic computerized processing of linguistic information has become one of the most important criteria for measuring whether a country has entered the information society. The study begins with statistical linguistics and aims to process complicated Chinese information. In...

Celý popis

Uložené v:
Podrobná bibliografia
Vydané v:Applied mathematics and nonlinear sciences Ročník 9; číslo 1
Hlavný autor: Peng, Haitao
Médium: Journal Article
Jazyk:English
Vydavateľské údaje: Beirut Sciendo 01.01.2024
De Gruyter Brill Sp. z o.o., Paradigm Publishing Services
Predmet:
ISSN:2444-8656, 2444-8656
On-line prístup:Získať plný text
Tagy: Pridať tag
Žiadne tagy, Buďte prvý, kto otaguje tento záznam!
Popis
Shrnutí:The level and volume of automatic computerized processing of linguistic information has become one of the most important criteria for measuring whether a country has entered the information society. The study begins with statistical linguistics and aims to process complicated Chinese information. In this paper, after establishing the word database of the Chinese language, the language model is smoothed and compressed, the Chinese character information and Chinese language information are extracted, and the processing of Chinese grammar and Chinese semantic information is emphasized. Among them, Chinese grammar processing includes Chinese word analysis and basic phrase analysis based on the maximum entropy model, and Chinese semantic processing includes Bayesian-based word sense disambiguation, semantic role labeling based on the conditional random field model, and thesaurus-based semantic similarity calculation method. In addition, SECTILE-based Chinese text categorization and statistical linguistics-based machine translation methods are explored and tested for their effectiveness in Chinese natural language processing. The results show that the total average check accuracy and check the completeness of Chinese text are 78.65% and 72.24%, respectively, and the BLEU values of the translation methods are improved by [1.62,3.73] and [0.93,5.01] compared with the Baseline method, which is able to process the Chinese information accurately. The research plays an important role in the process of information processing based on Chinese language processing.
Bibliografia:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
ISSN:2444-8656
2444-8656
DOI:10.2478/amns-2024-1860