Power posteriors do not reliably learn the number of components in a finite mixture


Scientists and engineers are often interested in learning the number of subpopulations (or components) present in a data set. Data science folk wisdom tells us that a finite mixture model (FMM) with a prior on the number of components will fail to recover the true, data-generating number of components under model misspecification. But practitioners still widely use FMMs to learn the number of components, and statistical machine learning papers can be found recommending such an approach. Increasingly, though, data science papers suggest potential alternatives beyond vanilla FMMs, such as power posteriors, coarsening, and related methods. In this work we start by adding rigor to folk wisdom and proving that, under even the slightest model misspecification, the FMM component-count posterior diverges: the posterior probability of any particular finite number of latent components converges to 0 in the limit of infinite data. We use the same theoretical techniques to show that power posteriors with fixed power face the same undesirable divergence, and we provide a proof for the case where the power converges to a non-zero constant. We illustrate the practical consequences of our theory on simulated and real data. We conjecture how our methods may be applied to lend insight into other component-count robustification techniques.
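For readers unfamiliar with the term, one common way to write a power posterior over the component count is sketched below. The notation ($\alpha$ for the power, $k$ for the number of components, $\psi$ for the component density, $w_j$ and $\phi_j$ for the weights and component parameters) is assumed here for illustration and is not taken from the abstract; the mode of convergence in the last line is left unspecified, as it is above.

```latex
% Illustrative notation only: symbols are assumptions, not the paper's own.
% Finite mixture likelihood with k components, parameters theta = (w, phi):
\[
  f(x \mid \theta, k) \;=\; \sum_{j=1}^{k} w_j \, \psi(x \mid \phi_j),
  \qquad w_j \ge 0, \quad \sum_{j=1}^{k} w_j = 1 .
\]
% Power posterior on k: the likelihood is raised to a power alpha > 0
% before combining with the priors pi(k) and pi(theta | k).
\[
  \pi_\alpha(k \mid x_{1:n}) \;\propto\;
  \pi(k) \int \Bigl[\, \prod_{i=1}^{n} f(x_i \mid \theta, k) \Bigr]^{\alpha}
  \, \pi(\mathrm{d}\theta \mid k) .
\]
% alpha = 1 recovers the standard posterior.
% The divergence described in the abstract then reads, for each fixed k:
\[
  \pi_\alpha(k \mid x_{1:n}) \;\longrightarrow\; 0
  \quad \text{as } n \to \infty .
\]
```

Setting $\alpha = 1$ gives the usual FMM posterior, which is why the same divergence statement covers both the standard and the fixed-power case.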

NeurIPS Workshop: I Can’t Believe It’s Not Better

See also: Finite mixture models do not reliably learn the number of components (arXiv e-print 2007.04470) for results on learning the number of components in a finite mixture model via the usual posterior distribution.