Bayesian nonparametric clustering as a community detection problem

Tonellato S. F.
2020-01-01

Abstract

A wide class of Bayesian nonparametric priors leads to a representation of the distribution of the observable variables as a mixture density with an infinite number of components. Such a representation induces a clustering structure in the data. However, because of label switching, identifying clusters a posteriori is not straightforward, and some post-processing of the MCMC output is usually required. Alternatively, observations can be mapped onto a weighted undirected graph, where each node represents a sample item and edge weights are given by the posterior pairwise similarities. It is shown how, after building a particular random walk on such a graph, a community detection algorithm known as the map equation can be applied, which minimises the expected description length of the partition. A relevant feature of this method is that it allows the posterior uncertainty of the classification to be quantified.
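
The graph construction described in the abstract can be illustrated with a minimal sketch. It assumes the MCMC output is available as an integer array of cluster labels (one row per draw, one column per sample item), estimates the posterior pairwise similarity as the fraction of draws in which two items share a label, and then builds the standard weighted random walk on the resulting graph. The function names, the toy draws, the removal of self-loops, and the choice of the standard random walk are all assumptions made for illustration; the abstract does not specify the particular random walk used in the paper, and the map equation step itself is not reproduced here.

```python
import numpy as np

def posterior_similarity(allocations):
    """Estimate posterior pairwise similarities from MCMC cluster labels.

    allocations: array of shape (n_draws, n_items); entry [t, i] is the
    label of item i at draw t. Only label equality is used, so the result
    is unaffected by label switching.
    """
    z = np.asarray(allocations)
    # same[t, i, j] is True when items i and j share a cluster at draw t.
    same = z[:, :, None] == z[:, None, :]
    return same.mean(axis=0)

def random_walk(similarity):
    """Standard random walk on the weighted undirected similarity graph.

    Assumption: self-loops are dropped and the transition probability from
    i to j is proportional to the edge weight w[i, j].
    """
    w = np.array(similarity, dtype=float)
    np.fill_diagonal(w, 0.0)
    strength = w.sum(axis=1)
    if np.any(strength == 0):
        raise ValueError("isolated node: transition probabilities undefined")
    transition = w / strength[:, None]
    # For an undirected graph the stationary distribution is proportional
    # to node strength.
    stationary = strength / strength.sum()
    return transition, stationary

# Hypothetical toy output: 4 draws, 4 items. Draws 1 and 2 describe the
# same partition under switched labels.
draws = np.array([
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [2, 2, 1, 1],
])
psm = posterior_similarity(draws)
P, pi = random_walk(psm)
```

The transition matrix `P` and stationary distribution `pi` are the quantities a map equation implementation would take as input when searching for the partition with minimal expected description length. Because the similarities are averages over draws, entries well away from 0 and 1 flag pairs of items whose joint classification is uncertain a posteriori.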

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/3729558