The intrinsic dimension (id) of a dataset conveys essential information regarding the complexity of the underlying data-generating process. In particular, it describes the dimensionality of the latent manifold on which the data-generating probability distribution has support. Complex datasets may be characterized by multiple manifolds having different ids. To properly estimate these heterogeneous ids, a recent modeling approach uses finite scale mixtures of Pareto distributions aided by a homogeneity-inducing term in the likelihood. In this contribution, we explore a different modeling perspective, estimating scale mixtures of Pareto distributions via spatial product partition models. We present the general idea and introduce Spider, our Bayesian nonparametric approach. Finally, we showcase some encouraging preliminary results.
Bayesian nonparametric estimation of heterogeneous intrinsic dimension via product partition models
Denti F.;
2023