We describe an optimal randomized MapReduce algorithm for the problem of triangle enumeration that requires $\BO{E^{3/2}/(M\sqrt m)}$ rounds, where $m$ denotes the expected memory size of a reducer and $M$ the total available space. This generalizes the well-known vertex partitioning approach proposed in (Suri and Vassilvitskii, 2011) to multiple rounds, significantly increasing the size of the graphs that can be handled on a given system. We also give new theoretical (high probability) bounds on the work needed in each reducer, addressing the ``curse of the last reducer''. Indeed, our work is the first to give guarantees on the maximum load of each reducer for an arbitrary input graph. Our experimental evaluation shows the scalability of our approach, that it is competitive with existing methods improving the performance by a factor up to $2\times$, and that it can significantly increase the size of datasets that can be processed.

MapReduce Triangle Enumeration With Guarantees

SILVESTRI, FRANCESCO;
2014

Abstract

We describe an optimal randomized MapReduce algorithm for the problem of triangle enumeration that requires $\BO{E^{3/2}/(M\sqrt m)}$ rounds, where $m$ denotes the expected memory size of a reducer and $M$ the total available space. This generalizes the well-known vertex partitioning approach proposed in (Suri and Vassilvitskii, 2011) to multiple rounds, significantly increasing the size of the graphs that can be handled on a given system. We also give new theoretical (high probability) bounds on the work needed in each reducer, addressing the ``curse of the last reducer''. Indeed, our work is the first to give guarantees on the maximum load of each reducer for an arbitrary input graph. Our experimental evaluation shows the scalability of our approach, that it is competitive with existing methods improving the performance by a factor up to $2\times$, and that it can significantly increase the size of datasets that can be processed.
2014
Proceedings of the 23rd ACM International Conference on Conference on Information and Knowledge Management
9781450325981
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3021512
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 54
  • ???jsp.display-item.citation.isi??? ND
social impact