A Bayesian hierarchical model of compositional data with zeros: classification and evidence evaluation of forensic glass

Napier, Gary (2014) A Bayesian hierarchical model of compositional data with zeros: classification and evidence evaluation of forensic glass. PhD thesis, University of Glasgow.

Printed Thesis Information: https://eleanor.lib.gla.ac.uk/record=b3090267

Abstract

A Bayesian hierarchical model is proposed for modelling compositional data containing large concentrations of zeros. Two data transformations were used and compared: the commonly used additive log-ratio (alr) transformation for compositional data, and the square root of the compositional ratios. For these data the square root transformation was found to stabilise the variability better, and it had no difficulty handling the large concentrations of zeros. To deal with the zeros, two approaches have been implemented: the data augmentation approach and the composite model approach. The data augmentation approach treats any zero values as rounded zeros, i.e. traces of components below the limits of detection, and replaces those zeros with non-zero values as part of the modelling procedure. This is better than the simple approach of adding constant values to zeros, because updating the zeros within the model reduces the artificial correlation that such replacement can introduce. However, because the detection limit is small, it does not necessarily alleviate the problems of having a point mass very close to zero. The composite model approach treats any zero components as being absent from a composition. This is done by splitting the data into subsets according to the presence or absence of certain components, producing different data configurations that are then modelled separately. The models are applied to a database consisting of the elemental compositions of forensic glass fragments with many levels of variability and of various use types. The main purposes of the model are (i) to derive expressions for the posterior predictive probabilities of newly observed glass fragments to infer their use type (classification) and (ii) to compute the evidential value of glass fragments under two complementary propositions about their source (forensic evidence evaluation).
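As a rough illustration of the two transformations and of the composite model's splitting step, the following minimal Python sketch may help. It is not taken from the thesis: the function names, the choice of the last component as the reference (denominator) of the ratios, and the toy compositions are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def alr(composition, ref=-1):
    """Additive log-ratio: log(x_i / x_ref) for every part except the reference.

    Assumption: the reference part is the last component by default.
    Raises ValueError when any non-reference part is zero, since log(0) is undefined.
    """
    ref_idx = ref % len(composition)
    denom = composition[ref_idx]
    return [math.log(x / denom) for i, x in enumerate(composition) if i != ref_idx]


def sqrt_ratio(composition, ref=-1):
    """Square root of the compositional ratios: sqrt(x_i / x_ref).

    Zeros in non-reference parts are mapped to 0.0 rather than causing an error.
    """
    ref_idx = ref % len(composition)
    denom = composition[ref_idx]
    return [math.sqrt(x / denom) for i, x in enumerate(composition) if i != ref_idx]


def split_by_zero_pattern(compositions):
    """Group compositions by which parts are present (non-zero).

    Each key is a tuple of booleans, one per part; each group could then be
    modelled separately, in the spirit of the composite model approach.
    """
    groups = defaultdict(list)
    for comp in compositions:
        pattern = tuple(x > 0 for x in comp)
        groups[pattern].append(comp)
    return dict(groups)


# Hypothetical three-part compositions (illustrative values only).
toy = [[0.0, 1.0, 4.0], [2.0, 1.0, 4.0], [0.0, 3.0, 1.0]]
groups = split_by_zero_pattern(toy)
```

With these toy values, `sqrt_ratio` accepts the compositions containing a zero, whereas `alr` raises an error on them, which is the practical difficulty that the data augmentation and composite model approaches are designed to address.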
Simulation studies using cross-validation are carried out to assess both model approaches. Both perform well at classifying glass fragments of use types bulb, headlamp and container, but less well when classifying car and building windows. The composite model approach marginally outperforms the data augmentation approach at the classification task; both approaches have the edge over support vector machines (SVM). Both model approaches also perform well when evaluating the evidential value of glass fragments, with false negative and false positive error rates below 5%. The results from glass classification and evidence evaluation are an improvement over existing methods. Assessment of the models as part of the evidence evaluation simulation study also leads to a restriction being placed upon the reported strength of this type of evidence. To prevent strong support in favour of the wrong proposition, it is recommended that this glass evidence should provide, at most, moderately strong support in favour of a proposition. The classification and evidence evaluation procedures are implemented in an online web application, which outputs the corresponding results for a given set of elemental composition measurements. The web application provides a quick and easy-to-use tool for forensic scientists who deal with this type of forensic evidence in real-life casework.
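The restriction on reported strength and the two error rates can be sketched as follows. This is a hedged illustration rather than the procedure used in the thesis: the function names are hypothetical, the cap of 1000 is an assumed numerical stand-in for "moderately strong" support, and the likelihood ratio values are invented toy numbers. The error-rate definitions assume the common convention that a likelihood ratio above 1 supports the same-source proposition and a value below 1 supports the different-source proposition.

```python
def cap_likelihood_ratio(lr, cap=1000.0):
    """Clamp a likelihood ratio to the interval [1/cap, cap].

    Assumption: `cap` is an illustrative upper bound on reported support;
    the same bound is applied symmetrically to support for either proposition.
    """
    if lr <= 0:
        raise ValueError("a likelihood ratio must be positive")
    return min(max(lr, 1.0 / cap), cap)


def error_rates(same_source_lrs, different_source_lrs):
    """Return (false negative rate, false positive rate).

    False negative: a same-source comparison with LR < 1.
    False positive: a different-source comparison with LR > 1.
    """
    fn = sum(lr < 1 for lr in same_source_lrs) / len(same_source_lrs)
    fp = sum(lr > 1 for lr in different_source_lrs) / len(different_source_lrs)
    return fn, fp


# Invented toy likelihood ratios, for illustration only.
same_source = [10.0, 0.5, 3.0, 8.0]
different_source = [0.1, 0.2, 2.0, 0.01]
fn_rate, fp_rate = error_rates(same_source, different_source)
reported = [cap_likelihood_ratio(lr) for lr in (1e6, 50.0, 1e-6)]
```

Capping changes only the reported magnitude, not the direction of support, so the error rates computed from the uncapped values are unaffected by it.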

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Keywords: Bayes factor, compositional data, compositional zeros, classification, evidence evaluation, forensic glass, hierarchical model, Markov chain Monte Carlo
Subjects: H Social Sciences > HA Statistics; Q Science > QA Mathematics
Colleges/Schools: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Supervisor's Name: Neocleous, Dr. Tereza and Nobile, Dr. Agostino
Date of Award: 2014
Depositing User: Mr Gary Napier
Unique ID: glathesis:2014-5793
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 01 Dec 2014 11:13
Last Modified: 11 Dec 2014 16:12
URI: https://theses.gla.ac.uk/id/eprint/5793
