Networks are ubiquitous tools that model many real world systems, ranging from academic collaborations to online markets. A massive amount of data is produced everyday from many systems that can be modeled through the network paradigm. The field of data mining aims at analyzing networked systems to extract meaningful patterns, to characterize the behaviour, and enhance our comprehension of the complicated systems being anayzed. Patterns play fundamental roles in defining important primitives that are used in exploratory analyses and specific applications to better understand networks. Modern systems, additionally to the network structure contain richer information about the timing of the interactions over the network, these networks are called temporal networks. Given the richer structure of temporal networks, novel patterns and primitives based on such patterns have been introduced in the field of data mining. Unfortunately, computing patterns on temporal networks is often a really hard and challenging task due to the more complicated structure of such networks and their patterns. In this thesis we develop efficient and rigorous algorithms for several primitives based on the computation of different patterns in temporal networks. The first patterns we consider are temporal motifs. A temporal motif is a small pattern defined by a small topology, capturing the function of the pattern, e.g., on a message network if such topology is a triangle then it captures the communication pattern between three users, and an ordering of its edges, that captures the dynamics of the topology over the temporal network. Analysing temporal motifs in temporal networks is a really challenging task, with the problem being NP-Hard in its general formulation. In this thesis we consider two different primitives based on such patterns. We first develop exact and approximate algorithms for obtaining a count of an arbitrary temporal motif in a temporal network. We then perform an extensive experimental evaluation to show that our algorithms improve over existing state-of-the-art algorithms, providing therefore practical algorithms to address the computation of the count of a temporal motif over a temporal network. Then we propose a novel problem of computing simultaneously the counts of multiple temporal motifs, all sharing the same common static topology. This problem is motivated by the fact that the ordering of a temporal motif cannot often be known a priori with high accuracy, especially in exploratory analyses. We then propose a randomized approximation algorithm to obtain high-quality approximation of all the temporal motifs addressing the novel problem introduced. We show bounds on the number of samples of the proposed algorithm to obtain concentrated estimates. We then perform a large experimental evaluation to show how such algorithm can be used to address the problem of obtaining the counts of all the motifs mapped on the same static topology, improving significantly existing state-of-the-art algorithms. The second type of patterns we consider are temporal paths, such patterns account for the connectivity on temporal networks. These patterns are used to define an important primitive, the emporal betweenness centrality, such measure assigns to each node a value that is based on the fraction of optimal temporal paths using a specific node, and nodes with higher temporal betweenness values are more central in spreading processes over the temporal network. We address the problem of the efficient computation of the temporal betweenness centrality of the nodes in temporal networks by developing a sampling-based approximation algorithm providing rigorous guarantees on its output. We empirically show significantly improves the scalability and the resources used compared to the state-of-the-art exact algorithm for computing the exact values of temporal betweenness centrality.

Networks are ubiquitous tools that model many real world systems, ranging from academic collaborations to online markets. A massive amount of data is produced everyday from many systems that can be modeled through the network paradigm. The field of data mining aims at analyzing networked systems to extract meaningful patterns, to characterize the behaviour, and enhance our comprehension of the complicated systems being anayzed. Patterns play fundamental roles in defining important primitives that are used in exploratory analyses and specific applications to better understand networks. Modern systems, additionally to the network structure contain richer information about the timing of the interactions over the network, these networks are called temporal networks. Given the richer structure of temporal networks, novel patterns and primitives based on such patterns have been introduced in the field of data mining. Unfortunately, computing patterns on temporal networks is often a really hard and challenging task due to the more complicated structure of such networks and their patterns. In this thesis we develop efficient and rigorous algorithms for several primitives based on the computation of different patterns in temporal networks. The first patterns we consider are temporal motifs. A temporal motif is a small pattern defined by a small topology, capturing the function of the pattern, e.g., on a message network if such topology is a triangle then it captures the communication pattern between three users, and an ordering of its edges, that captures the dynamics of the topology over the temporal network. Analysing temporal motifs in temporal networks is a really challenging task, with the problem being NP-Hard in its general formulation. In this thesis we consider two different primitives based on such patterns. We first develop exact and approximate algorithms for obtaining a count of an arbitrary temporal motif in a temporal network. We then perform an extensive experimental evaluation to show that our algorithms improve over existing state-of-the-art algorithms, providing therefore practical algorithms to address the computation of the count of a temporal motif over a temporal network. Then we propose a novel problem of computing simultaneously the counts of multiple temporal motifs, all sharing the same common static topology. This problem is motivated by the fact that the ordering of a temporal motif cannot often be known a priori with high accuracy, especially in exploratory analyses. We then propose a randomized approximation algorithm to obtain high-quality approximation of all the temporal motifs addressing the novel problem introduced. We show bounds on the number of samples of the proposed algorithm to obtain concentrated estimates. We then perform a large experimental evaluation to show how such algorithm can be used to address the problem of obtaining the counts of all the motifs mapped on the same static topology, improving significantly existing state-of-the-art algorithms. The second type of patterns we consider are temporal paths, such patterns account for the connectivity on temporal networks. These patterns are used to define an important primitive, the emporal betweenness centrality, such measure assigns to each node a value that is based on the fraction of optimal temporal paths using a specific node, and nodes with higher temporal betweenness values are more central in spreading processes over the temporal network. We address the problem of the efficient computation of the temporal betweenness centrality of the nodes in temporal networks by developing a sampling-based approximation algorithm providing rigorous guarantees on its output. We empirically show significantly improves the scalability and the resources used compared to the state-of-the-art exact algorithm for computing the exact values of temporal betweenness centrality.

Efficient and Rigorous Algorithms for the Analysis of Large Temporal Networks / Sarpe, Ilie. - (2023 Mar 20).

Efficient and Rigorous Algorithms for the Analysis of Large Temporal Networks

SARPE, ILIE
2023

Abstract

Networks are ubiquitous tools that model many real world systems, ranging from academic collaborations to online markets. A massive amount of data is produced everyday from many systems that can be modeled through the network paradigm. The field of data mining aims at analyzing networked systems to extract meaningful patterns, to characterize the behaviour, and enhance our comprehension of the complicated systems being anayzed. Patterns play fundamental roles in defining important primitives that are used in exploratory analyses and specific applications to better understand networks. Modern systems, additionally to the network structure contain richer information about the timing of the interactions over the network, these networks are called temporal networks. Given the richer structure of temporal networks, novel patterns and primitives based on such patterns have been introduced in the field of data mining. Unfortunately, computing patterns on temporal networks is often a really hard and challenging task due to the more complicated structure of such networks and their patterns. In this thesis we develop efficient and rigorous algorithms for several primitives based on the computation of different patterns in temporal networks. The first patterns we consider are temporal motifs. A temporal motif is a small pattern defined by a small topology, capturing the function of the pattern, e.g., on a message network if such topology is a triangle then it captures the communication pattern between three users, and an ordering of its edges, that captures the dynamics of the topology over the temporal network. Analysing temporal motifs in temporal networks is a really challenging task, with the problem being NP-Hard in its general formulation. In this thesis we consider two different primitives based on such patterns. We first develop exact and approximate algorithms for obtaining a count of an arbitrary temporal motif in a temporal network. We then perform an extensive experimental evaluation to show that our algorithms improve over existing state-of-the-art algorithms, providing therefore practical algorithms to address the computation of the count of a temporal motif over a temporal network. Then we propose a novel problem of computing simultaneously the counts of multiple temporal motifs, all sharing the same common static topology. This problem is motivated by the fact that the ordering of a temporal motif cannot often be known a priori with high accuracy, especially in exploratory analyses. We then propose a randomized approximation algorithm to obtain high-quality approximation of all the temporal motifs addressing the novel problem introduced. We show bounds on the number of samples of the proposed algorithm to obtain concentrated estimates. We then perform a large experimental evaluation to show how such algorithm can be used to address the problem of obtaining the counts of all the motifs mapped on the same static topology, improving significantly existing state-of-the-art algorithms. The second type of patterns we consider are temporal paths, such patterns account for the connectivity on temporal networks. These patterns are used to define an important primitive, the emporal betweenness centrality, such measure assigns to each node a value that is based on the fraction of optimal temporal paths using a specific node, and nodes with higher temporal betweenness values are more central in spreading processes over the temporal network. We address the problem of the efficient computation of the temporal betweenness centrality of the nodes in temporal networks by developing a sampling-based approximation algorithm providing rigorous guarantees on its output. We empirically show significantly improves the scalability and the resources used compared to the state-of-the-art exact algorithm for computing the exact values of temporal betweenness centrality.
Efficient and Rigorous Algorithms for the Analysis of Large Temporal Networks
20-mar-2023
Networks are ubiquitous tools that model many real world systems, ranging from academic collaborations to online markets. A massive amount of data is produced everyday from many systems that can be modeled through the network paradigm. The field of data mining aims at analyzing networked systems to extract meaningful patterns, to characterize the behaviour, and enhance our comprehension of the complicated systems being anayzed. Patterns play fundamental roles in defining important primitives that are used in exploratory analyses and specific applications to better understand networks. Modern systems, additionally to the network structure contain richer information about the timing of the interactions over the network, these networks are called temporal networks. Given the richer structure of temporal networks, novel patterns and primitives based on such patterns have been introduced in the field of data mining. Unfortunately, computing patterns on temporal networks is often a really hard and challenging task due to the more complicated structure of such networks and their patterns. In this thesis we develop efficient and rigorous algorithms for several primitives based on the computation of different patterns in temporal networks. The first patterns we consider are temporal motifs. A temporal motif is a small pattern defined by a small topology, capturing the function of the pattern, e.g., on a message network if such topology is a triangle then it captures the communication pattern between three users, and an ordering of its edges, that captures the dynamics of the topology over the temporal network. Analysing temporal motifs in temporal networks is a really challenging task, with the problem being NP-Hard in its general formulation. In this thesis we consider two different primitives based on such patterns. We first develop exact and approximate algorithms for obtaining a count of an arbitrary temporal motif in a temporal network. We then perform an extensive experimental evaluation to show that our algorithms improve over existing state-of-the-art algorithms, providing therefore practical algorithms to address the computation of the count of a temporal motif over a temporal network. Then we propose a novel problem of computing simultaneously the counts of multiple temporal motifs, all sharing the same common static topology. This problem is motivated by the fact that the ordering of a temporal motif cannot often be known a priori with high accuracy, especially in exploratory analyses. We then propose a randomized approximation algorithm to obtain high-quality approximation of all the temporal motifs addressing the novel problem introduced. We show bounds on the number of samples of the proposed algorithm to obtain concentrated estimates. We then perform a large experimental evaluation to show how such algorithm can be used to address the problem of obtaining the counts of all the motifs mapped on the same static topology, improving significantly existing state-of-the-art algorithms. The second type of patterns we consider are temporal paths, such patterns account for the connectivity on temporal networks. These patterns are used to define an important primitive, the emporal betweenness centrality, such measure assigns to each node a value that is based on the fraction of optimal temporal paths using a specific node, and nodes with higher temporal betweenness values are more central in spreading processes over the temporal network. We address the problem of the efficient computation of the temporal betweenness centrality of the nodes in temporal networks by developing a sampling-based approximation algorithm providing rigorous guarantees on its output. We empirically show significantly improves the scalability and the resources used compared to the state-of-the-art exact algorithm for computing the exact values of temporal betweenness centrality.
Efficient and Rigorous Algorithms for the Analysis of Large Temporal Networks / Sarpe, Ilie. - (2023 Mar 20).
File in questo prodotto:
File Dimensione Formato  
mainA.pdf

accesso aperto

Descrizione: tesi_Ilie_Sarpe
Tipologia: Tesi di dottorato
Dimensione 11.03 MB
Formato Adobe PDF
11.03 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11577/3472981
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact