Zhu, Ximin (2009) Topics on statistical design and analysis of cDNA microarray experiment. PhD thesis, University of Glasgow.
Abstract
A microarray is a powerful tool for surveying the expression levels of many thousands of genes simultaneously. It belongs to the new genomics technologies, which have important applications in the biological, agricultural and pharmaceutical sciences. In this thesis, we focus on the dual-channel cDNA microarray, one of the most popular microarray technologies, and discuss three topics: optimal experimental design; estimation of the proportion of true nulls, the local false discovery rate (lFDR) and the positive false discovery rate (pFDR); and dye effect normalization.

The first topic consists of four subtopics, each addressing an independent, practical problem in cDNA microarray experimental design. In the first subtopic, we propose an optimization strategy based on simulated annealing to find optimal or near-optimal designs with both biological and technical replicates. In the second, we discuss how to apply the Q-criterion to the factorial design of microarray experiments. In the third, we suggest an optimal way of pooling samples, which is in effect a replication scheme that minimizes the variance of the experiment subject to a fixed total cost. In the fourth, we argue that the criterion for the distant pair design is not appropriate and propose an alternative criterion.

The second topic of this thesis is dye effect normalization. In cDNA microarray technology, each array compares two samples, which are usually labelled with different dyes, Cy3 and Cy5. The technology assumes that, for a given gene (spot) on the array, if the Cy3-labelled sample has k times as much of a transcript as the Cy5-labelled sample, then the Cy3 signal should be k times as high as the Cy5 signal, and vice versa. This important assumption requires the dyes to have the same properties. In reality, however, the Cy3 and Cy5 dyes have slightly different properties, and the relative efficiency of the dyes varies across the intensity range in a "banana-shaped" way. To remove the dye effect, we propose a novel normalization method based on modelling the dye response functions and the dye effect curve. Real and simulated microarray data sets are used to evaluate the method, and its performance is shown to be satisfactory.

The third topic is the estimation of the proportion of true null hypotheses, lFDR and pFDR. A typical microarray experiment measures expression data for a large number of genes. To find differentially expressed genes, these variables are usually screened simultaneously by a statistical test. Because this is a case of multiple hypothesis testing, some adjustment should be made to the p-values resulting from the test. Many multiple-testing error rates, such as FDR, lFDR and pFDR, have been proposed to address this issue. A key related problem is the estimation of the proportion of true null hypotheses (i.e. non-differentially expressed genes). To model the distribution of the p-values, we propose three kinds of finite mixture with an unknown number of components (the first component corresponds to differentially expressed genes and the remaining components correspond to non-differentially expressed ones). We apply a new MCMC method, the allocation sampler, to estimate the proportion of true nulls (i.e. the mixture weight of the first component). The method also provides a framework for estimating lFDR and pFDR. Two real microarray data studies and a small simulation study are used to assess the method, and its performance is shown to be satisfactory.
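To make the idea of estimating the proportion of true nulls from p-values concrete, here is a minimal Python sketch. It does not implement the thesis's mixture model or the allocation sampler. It uses the simpler and widely known Storey-type estimator as a stand-in, together with a plug-in FDR estimate. The function names `estimate_pi0` and `plugin_fdr`, the tuning value `lam=0.5`, and the simulated data are all assumptions made for illustration.

```python
"""Sketch: Storey-type estimate of the proportion of true nulls.

NOT the allocation-sampler mixture approach of the thesis; a simpler
stand-in shown for illustration only. All names here are hypothetical.
"""
import random


def estimate_pi0(pvalues, lam=0.5):
    """Estimate the proportion of true nulls as #{p > lam} / (m * (1 - lam)).

    Assumes null p-values are Uniform(0, 1) and that p-values above `lam`
    come almost entirely from true nulls. `lam` is a tuning parameter.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    m = len(pvalues)
    if m == 0:
        raise ValueError("pvalues must not be empty")
    n_above = sum(1 for p in pvalues if p > lam)
    # Cap at 1 because a proportion cannot exceed 1.
    return min(1.0, n_above / (m * (1.0 - lam)))


def plugin_fdr(pvalues, threshold, pi0):
    """Plug-in FDR estimate when rejecting every p-value <= threshold."""
    m = len(pvalues)
    n_rejected = sum(1 for p in pvalues if p <= threshold)
    # Expected false rejections (pi0 * m * threshold) over observed rejections.
    return min(1.0, pi0 * m * threshold / max(n_rejected, 1))


if __name__ == "__main__":
    random.seed(0)
    # Simulated data (an assumption): 800 nulls, 200 genes with small p-values.
    null_p = [random.random() for _ in range(800)]
    alt_p = [random.betavariate(0.2, 8.0) for _ in range(200)]
    pvals = null_p + alt_p
    pi0_hat = estimate_pi0(pvals)
    print("estimated proportion of true nulls:", round(pi0_hat, 3))
    print("plug-in FDR at p <= 0.01:", round(plugin_fdr(pvals, 0.01, pi0_hat), 3))
```

The estimator relies only on null p-values being uniform, so it needs no model for the non-null genes. A mixture model such as the one described in the abstract trades that simplicity for a full description of the p-value distribution, which is what makes component-wise quantities like lFDR available.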
Item Type:  Thesis (PhD) 

Qualification Level:  Doctoral 
Keywords:  cDNA microarray, optimal design, dye normalization, multiple hypothesis testing, mixture model 
Subjects:  Q Science > QH Natural history > QH301 Biology; H Social Sciences > HA Statistics; Q Science > QA Mathematics
Colleges/Schools:  College of Science and Engineering > School of Mathematics and Statistics > Statistics 
Funder's Name:  UNSPECIFIED 
Supervisor's Name:  Wit, Prof. Ernst and Nobile, Dr. Agostino
Date of Award:  2009 
Depositing User:  Mr Ximin Zhu 
Unique ID:  glathesis:2009-1206
Copyright:  Copyright of this thesis is held by the author. 
Date Deposited:  28 Oct 2009 
Last Modified:  10 Dec 2012 13:35 
URI:  http://theses.gla.ac.uk/id/eprint/1206 