Triangle counting is a fundamental and widely studied problem on static graphs, and recently on temporal graphs, where edges carry information on the timings of the associated events. Streaming processing and resource efficiency are crucial requirements for counting triangles in modern massive temporal graphs, with millions of nodes and up to billions of temporal edges. However, current exact and approximate algorithms are unable to handle large-scale temporal graphs. To fill such a gap, we introduce STEP, a scalable and efficient algorithm to approximate temporal triangle counts from a stream of temporal edges. STEP combines predictions to the number of triangles a temporal edge is involved in, with a simple sampling strategy, leading to scalability, efficiency, and accurate approximation of all eight temporal triangle types simultaneously. We analytically prove that, by using a sublinear amount of memory, STEP obtains unbiased and very accurate estimates. In fact, even noisy predictions can significantly reduce the variance of STEP ’s estimates. Our extensive experiments on massive temporal graphs with up to billions of edges demonstrate that STEP outputs high-quality estimates and is more efficient than state-of-the-art methods.

Efficient Approximate Temporal Triangle Counting in Streaming with Predictions

Venturin, Giorgio;Vandin, Fabio
2025

Abstract

Triangle counting is a fundamental and widely studied problem on static graphs, and recently on temporal graphs, where edges carry information on the timings of the associated events. Streaming processing and resource efficiency are crucial requirements for counting triangles in modern massive temporal graphs, with millions of nodes and up to billions of temporal edges. However, current exact and approximate algorithms are unable to handle large-scale temporal graphs. To fill such a gap, we introduce STEP, a scalable and efficient algorithm to approximate temporal triangle counts from a stream of temporal edges. STEP combines predictions to the number of triangles a temporal edge is involved in, with a simple sampling strategy, leading to scalability, efficiency, and accurate approximation of all eight temporal triangle types simultaneously. We analytically prove that, by using a sublinear amount of memory, STEP obtains unbiased and very accurate estimates. In fact, even noisy predictions can significantly reduce the variance of STEP ’s estimates. Our extensive experiments on massive temporal graphs with up to billions of edges demonstrate that STEP outputs high-quality estimates and is more efficient than state-of-the-art methods.
2025
Machine Learning and Knowledge Discovery in Databases. Research Track
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases - ECML PKDD
9783032060655
9783032060662
   DARE - Digital Lifelong Prevention - Spoke 3 - Digitally-enabled Secondary and Tertiary Prevention
   Italian Ministry of University and Research
   PNC-I.1 “Research initiatives for innovative technologies and pathways in the health and welfare sector, D.D. 931 of 06/06/2022, PNC0000002 DARE - Digital Lifelong Prevention CUP:B53C22006440001”

   EXPAND: scalable algorithms for EXPloratory Analyses of heterogeneous and dynamic Networked Data
   PRIN

   ERC Advanced Grant REBOUND
   European Research Council

   WallenbergAI, Autonomous Systems and Software Program
   WASP
   Knut and Alice Wallenberg Foundation
File in questo prodotto:
Non ci sono file associati a questo prodotto.
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3573351
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
  • OpenAlex 0
social impact