Climbing modes and exploring mixtures: a journey in density-based clustering / Casa, Alessandro. - (2019 Nov).

Climbing modes and exploring mixtures: a journey in density-based clustering

Casa, Alessandro
2019

Abstract

The density-based formulation aims to recast the clustering problem in a mathematically sound framework by linking the groups to features of the density assumed to underlie the data. Although early probabilistic approaches to cluster analysis date back some fifty years, the topic has recently attracted renewed and vibrant interest in the scientific community. This may be motivated both by the computational advances of recent years and by more conceptual and substantive reasons: the increased availability of mixed-type and complex structured data has required a rigorous formalization of the clustering problem. Stemming from the same roots, density-based clustering has developed along two distinct paths. In its parametric formulation, groups are associated with the unimodal components of a mixture model. According to the nonparametric paradigm, by contrast, clusters are seen as the domains of attraction of the modes of the density. Revolving around this approach to clustering, the thesis explores both formulations, highlighting their points of contact as well as their differences. Moreover, differently structured data are considered, ranging from the unidimensional setting to more complex three-way structures. Three main contributions can be highlighted. In the first, we derive some asymptotic results to address nonparametric density estimation from a clustering-oriented perspective. In the second, we propose an ensemble approach to density-based clustering, which inherits the strengths of both the parametric and the nonparametric formulations and possibly enhances the robustness and stability of the partitions. The third contribution addresses the problem of clustering complex multivariate time-dependent data by adopting a parametric approach and proposing a flexible modification of the Latent Block Model.
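To make the nonparametric notion of "domains of attraction of the modes" concrete, the following is a minimal sketch in Python. It is not taken from the thesis: it uses the standard mean-shift iteration on a Gaussian kernel density estimate as one possible mode-climbing procedure, and the function names `mean_shift_1d` and `modal_clusters`, the bandwidth `h`, and the toy data are all illustrative assumptions.

```python
import numpy as np

def mean_shift_1d(x, h, n_iter=200, tol=1e-8):
    """Move every point uphill on a Gaussian kernel density estimate.

    Assumption: plain mean shift with a fixed bandwidth h, used here
    only as an illustrative mode-seeking procedure.
    """
    x = np.asarray(x, dtype=float)
    z = x.copy()
    for _ in range(n_iter):
        # Gaussian kernel weights between current positions and the data
        w = np.exp(-0.5 * ((z[:, None] - x[None, :]) / h) ** 2)
        # Each position moves to the kernel-weighted mean of the data
        z_new = (w @ x) / w.sum(axis=1)
        converged = np.max(np.abs(z_new - z)) < tol
        z = z_new
        if converged:
            break
    return z

def modal_clusters(x, h, merge_tol=1e-3):
    """Label each point by the mode that its mean-shift path reaches."""
    z = mean_shift_1d(x, h)
    modes = []
    labels = np.empty(len(z), dtype=int)
    for i, zi in enumerate(z):
        for k, m in enumerate(modes):
            if abs(zi - m) < merge_tol:
                labels[i] = k  # same domain of attraction as mode k
                break
        else:
            modes.append(zi)   # first point reaching a new mode
            labels[i] = len(modes) - 1
    return labels, np.array(modes)

# Hypothetical toy data: two well-separated groups on the real line
x = np.concatenate([np.linspace(-4.0, -2.0, 50), np.linspace(2.0, 4.0, 50)])
labels, modes = modal_clusters(x, h=0.8)
print(len(modes), np.bincount(labels))
```

In this sketch the partition depends heavily on the bandwidth `h`: a smaller value tends to produce more modes and hence more clusters, which is one reason bandwidth selection appears among the keywords of this record.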
nov-2019
nonparametric, density-based clustering, density estimation, coclustering, bandwidth selection, ensemble approach
Files in this item:
File: Tesi_PhD_Casa.pdf
Access: open access
Type: Doctoral thesis
License: Not specified
Size: 3.84 MB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11577/3422342