We study the complexity of local graph-centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to PageRank and Heat Kernel, for constructing a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $n$ nodes and $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(n^{1/2} \dmax^{1/2}, n^{1/2} m^{1/4}))$ nodes/arcs, where $\dmax$ is the maximum outdegree of the graph (omitting for readability $\poly(\epsilon^{-1})$ and $\polylog(\delta^{-1})$ factors). A similar bound holds for computational cost. We also prove a lower bound of $\Omega(\min(n^{1/2} \dmax^{1/2}, \, n^{1/3} m^{1/3}))$ for both query complexity and computational complexity. Moreover, in the jump-and-crawl graph-access model, our technique yields a $\tilde{O}(\min(n^{1/2}\dmax^{1/2}, n^{2/3}))$-queries algorithm; we show this algorithm is optimal up to a logarithmic factor -- in fact, sublogarithmic in the case of PageRank. These are the first algorithms with sublinear worst-case bounds for general directed graphs and any choice of the target node.

Sublinear Algorithms for Local Graph-Centrality Estimation

Peserico, Enoch;
2023

Abstract

We study the complexity of local graph-centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to PageRank and Heat Kernel, for constructing a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $n$ nodes and $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(n^{1/2} \dmax^{1/2}, n^{1/2} m^{1/4}))$ nodes/arcs, where $\dmax$ is the maximum outdegree of the graph (omitting for readability $\poly(\epsilon^{-1})$ and $\polylog(\delta^{-1})$ factors). A similar bound holds for computational cost. We also prove a lower bound of $\Omega(\min(n^{1/2} \dmax^{1/2}, \, n^{1/3} m^{1/3}))$ for both query complexity and computational complexity. Moreover, in the jump-and-crawl graph-access model, our technique yields a $\tilde{O}(\min(n^{1/2}\dmax^{1/2}, n^{2/3}))$-queries algorithm; we show this algorithm is optimal up to a logarithmic factor -- in fact, sublogarithmic in the case of PageRank. These are the first algorithms with sublinear worst-case bounds for general directed graphs and any choice of the target node.
2023
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3506401
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact