Christie, David Alan
(1998)
Genome Rearrangement Problems.
PhD thesis, University of Glasgow.
Abstract
Various global rearrangements of permutations, such as reversals and transpositions, have recently become of interest because of their applications in computational molecular biology. A reversal is an operation that reverses the order of a substring of a permutation. A transposition is an operation that swaps two adjacent substrings of a permutation. The problem of determining the smallest number of reversals required to transform a given permutation into the identity permutation is called sorting by reversals. Similar problems can be defined for transpositions and other global rearrangements. Related to sorting by reversals is the problem of establishing the reversal diameter. The reversal diameter of S_n (the symmetric group on n elements) is the maximum number of reversals required to sort a permutation of length n. Diameter problems can likewise be posed for other global rearrangements. These problems are of interest because permutations can be used to represent sequences of genes in chromosomes, and the global rearrangements then represent evolutionary events. As a result, we call these problems genome rearrangement problems. Genome rearrangement problems seem to be unlike previously studied algorithmic problems on sequences, so new methods have had to be developed to deal with them. These methods predominantly employ graphs to model permutation structure. However, even with these methods, a genome rearrangement problem often has no obvious polynomial-time algorithm, and in some cases can be shown to be NP-hard. For example, the problem of sorting by reversals is NP-hard, whereas the computational complexity of sorting by transpositions is open. For problems like these, it is natural to seek polynomial-time approximation algorithms that achieve an approximation guarantee.
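The definitions above can be made concrete with a short sketch. It is an illustration only, not an algorithm from the thesis: the function names and the 0-based, end-exclusive index convention are assumptions, and the diameter is found by exhaustive breadth-first search, which is feasible only for very small n.

```python
from collections import deque


def reversal(perm, i, j):
    """Reverse the substring perm[i:j] (0-based, j exclusive)."""
    return perm[:i] + perm[i:j][::-1] + perm[j:]


def transposition(perm, i, j, k):
    """Swap the adjacent substrings perm[i:j] and perm[j:k]."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def reversal_distances(n):
    """Map every permutation of 1..n to its reversal distance.

    Breadth-first search from the identity. Because each reversal is its
    own inverse, the distance from the identity to p equals the number of
    reversals needed to sort p.
    """
    identity = tuple(range(1, n + 1))
    dist = {identity: 0}
    queue = deque([identity])
    while queue:
        p = queue.popleft()
        for i in range(n):
            for j in range(i + 2, n + 1):  # substrings of length >= 2
                q = reversal(p, i, j)
                if q not in dist:
                    dist[q] = dist[p] + 1
                    queue.append(q)
    return dist


def reversal_diameter(n):
    """Maximum reversal distance over all permutations of length n."""
    return max(reversal_distances(n).values())


print(reversal((1, 2, 3, 4, 5), 1, 4))          # (1, 4, 3, 2, 5)
print(transposition((1, 2, 3, 4, 5), 0, 2, 5))  # (3, 4, 5, 1, 2)
print(reversal_diameter(4))                     # 3
```

The search visits all n! permutations, so it serves only to check small cases by hand; it says nothing about the approximation algorithms discussed in the thesis.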
In this thesis, we study several genome rearrangement problems as interesting and challenging algorithmic problems in their own right, including some problems for which the global rearrangement has no immediate biological equivalent. For example, we define a block-interchange to be a rearrangement that swaps any two substrings of the permutation. We examine, in particular, how the graph-theoretic models relate to the genome rearrangement problems that we study. The major new results contained in this thesis are as follows:
We present a 3/2-approximation algorithm for sorting by reversals. This is the best known approximation algorithm for the problem, and improves upon the 7/4 approximation bound of the previous best algorithm.
We give a polynomial-time algorithm for a significant special case of sorting by reversals, thereby disproving a conjecture of Kececioglu and Sankoff, who had suggested that this special case was likely to be NP-hard.
We analyse the structure of the so-called cycle graph of a permutation in the context of sorting by transpositions, and thereby gain a deeper insight into this problem. Among the consequences are a tighter lower bound for the problem, a simpler 3/2-approximation algorithm than had previously been described, and algorithms that, in empirical tests, almost always find the exact transposition distance of random permutations.
We introduce a natural generalisation of sorting by transpositions called sorting by block-interchanges, and present a polynomial-time algorithm for this problem.
We initiate the study of analogous problems on strings over a fixed-length alphabet. We establish upper and lower bounds and diameter results for the problems over a binary alphabet. We also prove that the problems analogous to sorting by reversals and sorting by block-interchanges are NP-hard.
(Abstract shortened by ProQuest.)
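To illustrate how a block-interchange generalises a transposition, the following sketch treats a transposition as the special case in which the two swapped substrings are adjacent. It is a brute-force illustration under assumed names and an assumed 0-based, end-exclusive index convention; it is not the polynomial-time algorithm presented in the thesis.

```python
from collections import deque


def block_interchange(perm, i, j, k, l):
    """Swap the substrings perm[i:j] and perm[k:l], where i < j <= k < l."""
    return perm[:i] + perm[k:l] + perm[j:k] + perm[i:j] + perm[l:]


def block_interchange_moves(perm):
    """Yield every permutation reachable by one block-interchange."""
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n + 1):
            for k in range(j, n):
                for l in range(k + 1, n + 1):
                    yield block_interchange(perm, i, j, k, l)


def transposition_moves(perm):
    """Yield every permutation reachable by one transposition.

    A transposition is a block-interchange of adjacent substrings (k == j).
    """
    n = len(perm)
    for i in range(n):
        for j in range(i + 1, n):
            for l in range(j + 1, n + 1):
                yield block_interchange(perm, i, j, j, l)


def min_moves(perm, moves):
    """Fewest moves needed to sort perm, by exhaustive breadth-first search."""
    start, target = tuple(perm), tuple(sorted(perm))
    dist = {start: 0}
    queue = deque([start])
    while queue:
        p = queue.popleft()
        if p == target:
            return dist[p]
        for q in moves(p):
            if q not in dist:
                dist[q] = dist[p] + 1
                queue.append(q)


print(min_moves((3, 2, 1), block_interchange_moves))  # 1
print(min_moves((3, 2, 1), transposition_moves))      # 2
```

The permutation 3 2 1 is sorted by one block-interchange (swap the first and last elements) but needs two transpositions, which shows that allowing non-adjacent substrings can strictly reduce the distance.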
Item Type: 
Thesis (PhD)

Qualification Level: 
Doctoral 
Keywords: 
Bioinformatics, Computer science 
Date of Award: 
1998 
Depositing User: 
Enlighten Team

Unique ID: 
glathesis:1998-74685
Copyright: 
Copyright of this thesis is held by the author. 
Date Deposited: 
27 Sep 2019 17:10 
Last Modified: 
27 Sep 2019 17:10 
URI: 
http://theses.gla.ac.uk/id/eprint/74685 