Bayesian analysis of finite mixture distributions using the allocation sampler

Fearnside, Alastair T. (2007) Bayesian analysis of finite mixture distributions using the allocation sampler. PhD thesis, University of Glasgow.

Full text available as:
PDF (2007fearnsidePhD.pdf)
Download (5MB)

Finite mixture distributions are receiving increasing attention from statisticians in many fields of research because they form a very flexible class of models. They are typically used for density estimation or to model population heterogeneity. One can think of a finite mixture distribution as grouping the observations into components from which they are assumed to have arisen. In certain settings these groups have a physical interpretation. Interest in these distributions has been boosted recently by the ever-increasing computing power available to researchers for the computationally intensive tasks that their analysis requires.
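The grouping view of a finite mixture can be made concrete with a short simulation. The sketch below is illustrative only and is not taken from the thesis: the function name `sample_mixture`, the choice of normal components, and all numerical values are assumptions. Each observation first receives a latent allocation (the index of the component it arises from) and is then drawn from that component.

```python
import random

def sample_mixture(n, weights, means, sds, seed=0):
    """Draw n observations from a finite mixture of normal components.

    Returns (allocations, observations), where allocations[i] is the index
    of the component that generated observations[i].
    """
    rng = random.Random(seed)
    components = list(range(len(weights)))
    # Step 1: pick a component for each observation according to the weights.
    allocations = rng.choices(components, weights=weights, k=n)
    # Step 2: draw each observation from its allocated component.
    observations = [rng.gauss(means[j], sds[j]) for j in allocations]
    return allocations, observations

# Hypothetical two-component example (values chosen for illustration only).
allocations, observations = sample_mixture(
    n=500, weights=[0.3, 0.7], means=[-2.0, 3.0], sds=[1.0, 0.5]
)
share_in_second = sum(1 for j in allocations if j == 1) / len(allocations)
print(f"share allocated to component 1: {share_in_second:.2f}")
```

In a real analysis only the observations are seen; the allocations are unobserved and are among the quantities to be inferred.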

To fit a finite mixture distribution using a Bayesian approach, a posterior distribution has to be evaluated. When the number of components in the model is assumed known, this posterior distribution can be sampled from using methods such as Data Augmentation or Gibbs sampling (Tanner and Wong (1987) and Gelfand and Smith (1990)) and the Metropolis-Hastings algorithm (Hastings (1970)). However, the number of components in the model can also be considered unknown and an object of inference. Richardson and Green (1997) and Stephens (2000a) both describe Bayesian methods to sample across models with different numbers of components. This enables the posterior distribution of the number of components to be estimated. Richardson and Green (1997) define a reversible jump Markov chain Monte Carlo (RJMCMC) sampler, while Stephens (2000a) uses a Markov birth-death process approach to sample from the posterior distribution. In this thesis a Markov chain Monte Carlo method, named the allocation sampler, is described. This sampler differs from the RJMCMC method reported in Richardson and Green (1997) because the state space of the sampler is simplified by the assumption that the components' parameters and weights can be analytically integrated out of the model. This in turn has the advantage that only minimal changes are required to the sampler for mixtures of components from other parametric families. This thesis illustrates the allocation sampler's performance on both simulated and real data sets.
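To illustrate what integrating out the component parameters and weights buys, the following is a minimal sketch of a collapsed Gibbs sweep over the allocations alone. It is not the allocation sampler of this thesis: the number of components is held fixed, only one simple type of move is used, and the model is an assumption made for the example (normal components with known variance `sigma2`, a normal prior with mean `m0` and variance `tau2` on each component mean, and a symmetric Dirichlet prior with parameter `alpha` on the weights). All function names and values are hypothetical.

```python
import math
import random

def predictive_density(x, n, s, m0=0.0, tau2=100.0, sigma2=1.0):
    """Density of x given n other points (with sum s) in the same component,
    after integrating out the component mean under its normal prior."""
    precision = 1.0 / tau2 + n / sigma2
    mean = (m0 / tau2 + s / sigma2) / precision
    var = 1.0 / precision + sigma2
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def collapsed_gibbs_sweep(x, z, k, rng, alpha=1.0):
    """Update every allocation z[i] in turn; the state is the allocations only."""
    counts = [0] * k
    sums = [0.0] * k
    for xi, zi in zip(x, z):
        counts[zi] += 1
        sums[zi] += xi
    for i, xi in enumerate(x):
        # Remove observation i from its current component.
        counts[z[i]] -= 1
        sums[z[i]] -= xi
        # Weights integrated out: (count + alpha); means integrated out: predictive.
        probs = [(counts[j] + alpha) * predictive_density(xi, counts[j], sums[j])
                 for j in range(k)]
        z[i] = rng.choices(range(k), weights=probs)[0]
        counts[z[i]] += 1
        sums[z[i]] += xi
    return z

# Hypothetical data: two well-separated groups of 100 points each.
rng = random.Random(1)
x = [rng.gauss(-3.0, 1.0) for _ in range(100)] + [rng.gauss(3.0, 1.0) for _ in range(100)]
z = [rng.randrange(2) for _ in x]  # random initial allocations
for _ in range(20):
    z = collapsed_gibbs_sweep(x, z, k=2, rng=rng)
print("component sizes:", [z.count(j) for j in range(2)])
```

Only `predictive_density` depends on the parametric family of the components, so under these assumptions switching to another conjugate family would mean replacing that one function.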

Chapter 1 provides a background to finite mixture distributions and gives an overview of some inferential techniques that have already been used to analyse these distributions.
Chapter 2 sets out the Bayesian model framework that is used throughout this thesis and defines all the required distributional results.
Chapter 3 describes the allocation sampler.
Chapter 4 tests the performance of the allocation sampler using simulated datasets from a collection of 15 different known mixture distributions.
Chapter 5 illustrates the allocation sampler with real datasets from a number of different research fields.
Chapter 6 summarises the research in the thesis and provides areas of possible future research.

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Classification, Normal mixtures, Multivariate normal mixtures, Mixtures of exponentials, Mixtures of uniforms, Markov chain Monte Carlo, Reversible Jump, Label switching, Allocation sampler, Galaxy data, Iris data, Acidity data, Enzyme data, Hidalgo stamps data, S&P 500 Returns data
Subjects: H Social Sciences > HA Statistics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Nobile, Dr. Agostino
Date of Award: 2007
Depositing User: Dr Alastair T Fearnside
Unique ID: glathesis:2007-555
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 24 May 2010
Last Modified: 10 Dec 2012 13:19
