A heuristic search algorithm based on subspaces for PageRank computation

We studied a fast algorithm for the large-scale computation of PageRank. PageRank is what the Google search engine uses to simulate the importance of web pages. It is defined by the eigenvector of a particular stochastic matrix related to the graphs of web pages. The power method is the typical mean...

Ausführliche Beschreibung

Gespeichert in:

Bibliographische Detailangaben
Veröffentlicht in:	The Journal of supercomputing Jg. 74; H. 7; S. 3278 - 3294
1. Verfasser:	Miyata, Takafumi
Format:	Journal Article
Sprache:	Englisch
Veröffentlicht:	New York Springer US 01.07.2018 Springer Nature B.V
Schlagworte:	Algorithms Compilers Computation Computer Science Computer simulation Eigenvectors Interpreters Processor Architectures Programming Languages Search algorithms Search engines Subspace methods Subspaces PageRank Parallel computing Krylov subspace Residual minimization Google matrix Power iteration
ISSN:	0920-8542, 1573-0484
Online-Zugang:	Volltext
Tags:	Tag hinzufügen Keine Tags, Fügen Sie den ersten Tag hinzu!

Beschreibung
Zusammenfassung:	We studied a fast algorithm for the large-scale computation of PageRank. PageRank is what the Google search engine uses to simulate the importance of web pages. It is defined by the eigenvector of a particular stochastic matrix related to the graphs of web pages. The power method is the typical means to compute the eigenvector, while the Krylov subspace method shows faster convergence, which can be regarded as a two-step algorithm. The first step predicts the eigenvector, and the second step corrects the predicted result. More precisely, the power method is first iterated to compute the eigenvector approximately. Secondly, a Krylov subspace spanned by the approximations is searched for a better approximate eigenvector in terms of minimizing a residual. To get a better approximation efficiently, we consider using subspaces not only at the second step but also at the first step. Specifically, a Krylov subspace is first used to compute an approximate eigenvector, by which another subspace is expanded. Secondly, this non-Krylov subspace is searched for a better approximate eigenvector that minimizes its residual over the subspace. This paper describes a heuristic search algorithm iterating the two steps alternately and presents its efficient implementation. Experimental results with huge Google matrices illustrate improvements in performance of the algorithm.
Bibliographie:	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14
ISSN:	0920-8542 1573-0484
DOI:	10.1007/s11227-018-2383-9