Climbing modes and exploring mixtures: a journey in density-based clustering / Casa, Alessandro. - (2019 Nov).
Climbing modes and exploring mixtures: a journey in density-based clustering
Casa, Alessandro
2019
Abstract
The density-based formulation recasts the clustering problem in a mathematically sound framework by linking groups to features of the density assumed to underlie the data. Although early probabilistic approaches to cluster analysis date back some fifty years, the topic has recently attracted renewed and vibrant interest in the scientific community. This may be motivated both by the computational advances of recent years and by more conceptual and substantive reasons: the increased availability of mixed-type and complex structured data has required a rigorous formalization of the clustering problem. Stemming from the same roots, density-based clustering has developed along two distinct paths. In its parametric formulation, groups are associated with the unimodal components of a mixture model. In the nonparametric paradigm, clusters are instead seen as the domains of attraction of the modes of the density. Revolving around this approach to clustering, the thesis explores both formulations, highlighting their points of contact as well as their dissimilarities. Moreover, differently structured data are considered, ranging from the unidimensional setting to more complex three-way structures. Three main contributions can be highlighted. In the first, we derive some asymptotic results to address nonparametric density estimation from a clustering-oriented perspective. In the second, we propose an ensemble approach to density-based clustering, which inherits the strengths of both the parametric and the nonparametric formulations and possibly enhances the robustness and stability of the partitions. The third contribution addresses the problem of clustering complex multivariate time-dependent data by adopting a parametric approach and proposing a flexible modification of the Latent Block Model.

File | Size | Format | |
---|---|---|---|
Tesi_PhD_Casa.pdf (open access; type: doctoral thesis; license: not specified) | 3.84 MB | Adobe PDF | View/Open |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.