The use of Bayes factors in fine-scale genetic association studies.
PhD thesis, University of Glasgow.
This thesis explores and compares methods for detecting genetic effects in fine-scale genotype-phenotype association studies. Such studies present particular challenges, owing to the strong linkage that can exist between variants and to the problems created by multiple testing. However, unlike Genome-Wide Association Studies (GWAS), they offer the potential to use information from haplotypes arising in regions of low genetic recombination.
To test the effectiveness of approaches used in fine-scale studies, the PheGe-Sim (Phenotype Genotype Simulation) application has been developed to simulate fine-scale phenotype-genotype data sets under a variety of scenarios. The simulations are based on the coalescent model, extended with population expansion, recombination, and finite-sites mutation so that real data sets can be mirrored more closely. The simulated data sets are then used to assess how effectively each method in this thesis finds the known simulated causal variants.
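The basic coalescent that underlies such simulations can be sketched as follows. This is a minimal illustration of the standard constant-size model only, not PheGe-Sim's implementation: it omits population expansion, recombination, and finite-sites mutation, and the function name `coalescent_times` is hypothetical.

```python
import random


def coalescent_times(n_samples, rng):
    """Waiting times between successive coalescent events.

    Assumes a constant-size population and time in coalescent units:
    while k lineages remain, the wait until the next merger is
    exponential with rate k * (k - 1) / 2.
    """
    times = []
    for k in range(n_samples, 1, -1):
        rate = k * (k - 1) / 2
        times.append(rng.expovariate(rate))
    return times


rng = random.Random(1)            # fixed seed so the run is repeatable
waits = coalescent_times(10, rng)
tmrca = sum(waits)                # time to the most recent common ancestor
```

A sample of 10 sequences yields 9 coalescent events, and their summed waiting times give the depth of the genealogy on which mutations would then be placed.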
One method suitable for testing associations in fine-scale genetic association studies is Treescan (Templeton et al., 2005). Treescan uses the relationships between closely related haplotypes to increase the power to detect genetic determinants of a phenotype. A haplotype tree is constructed, and each branch is tested in turn for evidence of association between the phenotype and the groups of haplotypes that the branch separates. For comparison with Treescan, analogous methods based on each individual SNP (single nucleotide polymorphism) and on each haplotype have also been implemented.
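The branch-by-branch idea can be illustrated with a short sketch. It is not the Treescan implementation: the Welch t statistic is used here as a simple stand-in for the test statistic, and the function names, the toy tree, and the phenotype values are all hypothetical.

```python
from statistics import mean, variance


def component(adj, start, blocked):
    """Haplotypes reachable from `start` without crossing the branch `blocked`."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in adj[node]:
            if {node, nxt} == set(blocked) or nxt in seen:
                continue
            seen.add(nxt)
            stack.append(nxt)
    return seen


def scan_branches(edges, phenotypes):
    """Cut each branch in turn and compare the two resulting groups.

    edges: list of (hap_a, hap_b) pairs forming a haplotype tree.
    phenotypes: dict mapping haplotype -> list of phenotype values.
    Returns {edge: Welch t statistic}; branches that leave a group with
    fewer than two values, or with zero standard error, are skipped.
    """
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    results = {}
    for edge in edges:
        side = component(adj, edge[0], edge)
        g1 = [y for h in side for y in phenotypes.get(h, [])]
        g2 = [y for h in adj if h not in side for y in phenotypes.get(h, [])]
        if len(g1) < 2 or len(g2) < 2:
            continue
        se = (variance(g1) / len(g1) + variance(g2) / len(g2)) ** 0.5
        if se == 0:
            continue
        results[edge] = (mean(g1) - mean(g2)) / se
    return results


# Toy data (hypothetical): a chain of four haplotypes, H1-H2-H3-H4.
edges = [("H1", "H2"), ("H2", "H3"), ("H3", "H4")]
phenotypes = {
    "H1": [1.0, 1.2],
    "H2": [0.9, 1.1],
    "H3": [2.0, 2.2],
    "H4": [2.1, 1.9],
}
results = scan_branches(edges, phenotypes)
best = max(results, key=lambda e: abs(results[e]))
```

In this toy example the branch between H2 and H3 gives the largest absolute statistic, because cutting it separates the low-phenotype haplotypes from the high-phenotype ones. Testing every branch in this way is also what creates the multiple-testing burden.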
Because of the multiple-testing issues that arise in GWAS, Balding (2006) advocated the use of Bayes factors as an alternative to the standard use of p-values for categorical data sets. In this thesis, Bayes factors are formulated that are suitable for continuous phenotype data and for fine-scale association studies. These Bayes factors are used in a method that follows the Treescan approach of assessing groupings derived from a haplotype tree, adapted to take advantage of the flexibility that Bayes factors offer. Single-SNP and haplotype approaches have also been programmed using the same implementation of Bayes factors.
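As a rough illustration of a Bayes factor for a continuous phenotype, the sketch below compares a model with two group means against a model with one common mean. The abstract does not give the thesis's own formulation, so the BIC (Schwarz) approximation is used here as a stand-in; the function names and data are hypothetical, and the code assumes normally distributed residuals and non-zero residual sums of squares.

```python
import math


def rss(values, centre):
    """Residual sum of squares of `values` around `centre`."""
    return sum((v - centre) ** 2 for v in values)


def bic_bayes_factor(group_a, group_b):
    """Approximate Bayes factor for 'two group means' vs 'one common mean'.

    Uses BF ~ exp((BIC_null - BIC_alt) / 2), with
    BIC = n * ln(RSS / n) + k * ln(n) for a Gaussian model
    (k = 2 parameters under the null, k = 3 under the alternative).
    Values above 1 favour the two-group model.
    """
    pooled = group_a + group_b
    n = len(pooled)
    rss_null = rss(pooled, sum(pooled) / n)
    rss_alt = (rss(group_a, sum(group_a) / len(group_a))
               + rss(group_b, sum(group_b) / len(group_b)))
    bic_null = n * math.log(rss_null / n) + 2 * math.log(n)
    bic_alt = n * math.log(rss_alt / n) + 3 * math.log(n)
    return math.exp((bic_null - bic_alt) / 2)


# Hypothetical phenotype values for two haplotype groups.
bf_effect = bic_bayes_factor([1.0, 1.2, 0.9, 1.1], [2.0, 2.2, 2.1, 1.9])
bf_no_effect = bic_bayes_factor([1.0, 1.2, 0.9, 1.1], [1.1, 0.9, 1.2, 1.0])
```

When the groups differ clearly the factor is far above 1, and when the group means coincide it falls below 1, so the evidence can point in either direction rather than only against a null hypothesis.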
The PheGe-Find (Phenotype Genotype-Find) application has been developed to implement the association methods when supplied with the appropriate genotype and phenotype input files. In addition to being tested on simulated data, the approaches are tested on two real data sets. The first concerns genotypes and phenotypes of the fruit fly Drosophila melanogaster, which were previously assessed using the original Treescan approach of Templeton et al. (2005). This allows the different approaches to be compared on a data set for which there is strong evidence of a causal link between the genotype and phenotypes concerned. The second, a data set of genetic variants surrounding the human ADRA1A gene, is assessed for potential causative genetic effects on blood pressure and heart rate phenotype measurements.