Research Front Detection and Topic Evolution Based on Topological Structure and the PageRank Algorithm

Research front detection and topic evolution has for a long time been an important direction for research in the informetrics field. However, most previous studies either simply use a citation count for scientific document clustering or assume that each scientific document has the same importance in...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	Symmetry (Basel) Jg. 11; H. 3; S. 310
Hauptverfasser:	Xu, Yangbing, Zhang, Shuai, Zhang, Wenyu, Yang, Shuiqing, Shen, Yue
Format:	Journal Article
Sprache:	Englisch
Veröffentlicht:	Basel MDPI AG 01.03.2019
Schlagworte:	Algorithms Bibliographic coupling Citation analysis Citations Clustering Cocitation Data mining Evolutionary algorithms Graph theory Greedy algorithms Keywords Probability distribution Search algorithms Similarity Time lag Topology Visualization Windows (intervals)
ISSN:	2073-8994, 2073-8994
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	Research front detection and topic evolution has for a long time been an important direction for research in the informetrics field. However, most previous studies either simply use a citation count for scientific document clustering or assume that each scientific document has the same importance in detecting the clustering theme in a cluster. In this study, utilizing the topological structure and the PageRank algorithm, we propose a new research front detection and topic evolution approach based on graph theory. This approach is made up of three stages: (1) Setting a time window with appropriate length according to the accuracy of scientific documents clustering results and the time delay of a scientific document to be cited, dividing scientific documents into several time windows according to their years of publication, calculating similarities between them according to their topological structure, and clustering them in each time window based on the fast greedy algorithm; (2) combining the PageRank algorithm and keywords’ frequency to detect the clustering theme, which assumes that the more important a scientific document in the cluster is, the greater the possibility that it is cited by the other documents in the same cluster; and (3) reconstructing the cluster graph where nodes represent clusters and edges’ strengths represent the similarities between different clusters, then detecting research front and identifying topic evolution based on the reconstructed cluster graph. To evaluate the performance of our proposed approach, the scientific documents related to data mining and covered by Science Citation Index Expanded (SCI-EXPANDED) or Social Science Citation Index (SSCI) in Web of Science are collected as a case study. The experiment’s results show that the proposed approach can obtain reasonable clustering results, and it is effective for research front detection and topic evolution.
Bibliographie:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	2073-8994 2073-8994
DOI:	10.3390/sym11030310