Imagine you are a social network user who wants to search, in a list of potential candidates, for the best candidate for a job on the basis of their PageRank-induced importance ranking. Is it possible to compute this ranking for a low cost, by visiting only small subnetworks around the nodes that represent each candidate? The fundamental problem underpinning this question, i.e. computing locally the PageRank ranking of k nodes in an $n$-node graph, was first raised by Chen et al. (CIKM 2004) and then restated by Bar-Yossef and Mashiach (CIKM 2008). In this paper we formalize and provide the first analysis of the problem, proving that any local algorithm that computes a correct ranking must take into consideration Ω(√(kn)) nodes -- even when ranking the top $k$ nodes of the graph, even if their PageRank scores are "well separated", and even if the algorithm is randomized (and we prove a stronger Ω(n) bound for deterministic algorithms). Experiments carried out on large, publicly available crawls of the web and of a social network show that also in practice the fraction of the graph to be visited to compute the ranking may be considerable, both for algorithms that are always correct and for algorithms that employ (efficient) local score approximations.

Local Computation of PageRank: the Ranking Side

BRESSAN, MARCO;PRETTO, LUCA
2011

Abstract

Imagine you are a social network user who wants to search, in a list of potential candidates, for the best candidate for a job on the basis of their PageRank-induced importance ranking. Is it possible to compute this ranking for a low cost, by visiting only small subnetworks around the nodes that represent each candidate? The fundamental problem underpinning this question, i.e. computing locally the PageRank ranking of k nodes in an $n$-node graph, was first raised by Chen et al. (CIKM 2004) and then restated by Bar-Yossef and Mashiach (CIKM 2008). In this paper we formalize and provide the first analysis of the problem, proving that any local algorithm that computes a correct ranking must take into consideration Ω(√(kn)) nodes -- even when ranking the top $k$ nodes of the graph, even if their PageRank scores are "well separated", and even if the algorithm is randomized (and we prove a stronger Ω(n) bound for deterministic algorithms). Experiments carried out on large, publicly available crawls of the web and of a social network show that also in practice the fraction of the graph to be visited to compute the ranking may be considerable, both for algorithms that are always correct and for algorithms that employ (efficient) local score approximations.
2011
Proceedings of the 20th ACM International Conference on Information and Knowledge Management
9781450307178
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/2898699
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 14
  • ???jsp.display-item.citation.isi??? ND
social impact