Research on Keyword Extraction Algorithm in English Text Based on Cluster Analysis


Bibliographic details
Published in: Computational Intelligence and Neuroscience, Vol. 2022, pp. 1-8
Main author: Ma, Jingxia
Format: Journal Article
Language: English
Publication details: United States, Hindawi, 28.03.2022
John Wiley & Sons, Inc
ISSN: 1687-5265, 1687-5273
Description
Summary: Helping users find the text information they need quickly and accurately is an active research topic. Text clustering can improve the efficiency of information search and is an effective text retrieval method. Keyword extraction and cluster center selection are key issues in text clustering research. Common keyword extraction algorithms fall into three categories: semantic-based, machine learning-based, and statistical model-based. There are three common methods for selecting cluster centers: choosing the initial centers at random, specifying them manually, and selecting them according to the similarity between the points to be clustered. Randomly selected initial centers may include “outliers,” and the clustering results are then only locally optimal. Manually specified centers are highly subjective, because each person understands the text set differently, and the approach is not suitable for large text collections. Selecting centers according to the similarity between the points to be clustered distributes the chosen centers across the classes and places them as close as possible to the class centers, but computing them takes a long time. To address this problem, the paper proposes a keyword extraction algorithm based on cluster analysis. The reported results indicate that the algorithm does not rely on background knowledge bases, dictionaries, or similar resources; instead, it obtains statistical parameters and builds its models through training. Experiments show that the keyword extraction algorithm has high accuracy and can quickly extract the subject content of an English translation.
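The third center-selection strategy named in the summary, choosing centers by the similarity between the points to be clustered, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes plain term-frequency vectors, cosine similarity, and a farthest-first heuristic, and the names `tf_vector`, `cosine`, and `pick_centers` are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def pick_centers(docs, k):
    """Return indices of up to k documents chosen as initial cluster centers.

    Farthest-first heuristic (an assumption, not the paper's method):
    repeatedly add the document whose highest similarity to any
    already-chosen center is lowest.
    """
    if not docs or k <= 0:
        return []
    vectors = [tf_vector(d) for d in docs]
    centers = [0]  # assumption: seed with the first document
    while len(centers) < min(k, len(docs)):
        candidates = [i for i in range(len(docs)) if i not in centers]
        best = min(
            candidates,
            key=lambda i: max(cosine(vectors[i], vectors[c]) for c in centers),
        )
        centers.append(best)
    return centers


docs = [
    "cat dog cat",
    "dog cat",
    "stock market price",
    "market price stock rise",
]
centers = pick_centers(docs, 2)  # indices into docs
```

Each round compares every remaining document with every chosen center, which hints at why the summary calls similarity-based selection slow on large text sets.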
Academic Editor: Vijay Kumar
DOI: 10.1155/2022/4293102