On the minimum number of topologies explaining a sample of DNA sequences

Research output: Contribution to journal › Journal article › Research › peer-review

In this article I derive an alternative to Hudson and Kaplan's (Genetics 111, 147-165) algorithm that gives a lower bound on the number of recombination events in a sample's history. It is shown that the number, T_M, found by the algorithm is the least number of topologies required to explain a set of DNA sequences sampled under the infinite-site assumption. Let T = (T_1, …, T_r) be a list of topologies compatible with the sequences, i.e., T_k is compatible with an interval, I_k, of sites in the alignment. A characterization of all lists having T_M topologies is given, and it is shown that T_M relates to specific patterns in the alignment, here called chain series. Further, a number of theorems relating general lists of topologies to the number T_M are presented. The results are discussed in relation to the true minimum number of recombination events required to explain an alignment.
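
For orientation, the kind of lower bound the abstract refers to can be illustrated with the four-gamete test: under the infinite-site assumption, two sites that together show all four allele combinations cannot share a single topology, so at least one recombination event must lie between them. The sketch below is not the algorithm derived in the article. It is a minimal illustration of a Hudson-Kaplan style bound, computed here as the largest number of non-overlapping incompatible site intervals, and it assumes binary sequences encoded as strings of '0' and '1'. The function names `incompatible` and `recombination_lower_bound` are hypothetical.

```python
from itertools import combinations


def incompatible(seqs, i, j):
    """Four-gamete test: True if sites i and j show all four allele pairs."""
    return len({(s[i], s[j]) for s in seqs}) == 4


def recombination_lower_bound(seqs):
    """Lower bound on recombination events (illustrative sketch only).

    Each incompatible pair (i, j) with i < j implies at least one
    recombination somewhere between site i and site j. Counting a maximum
    set of such intervals that share no segment between adjacent sites
    gives a lower bound on the number of events.
    """
    n_sites = len(seqs[0])
    intervals = [
        (i, j)
        for i, j in combinations(range(n_sites), 2)
        if incompatible(seqs, i, j)
    ]
    # Greedy interval scheduling: sort by right endpoint, then keep every
    # interval that starts at or after the end of the last one kept.
    intervals.sort(key=lambda ij: ij[1])
    count = 0
    last_end = 0
    for i, j in intervals:
        if i >= last_end:
            count += 1
            last_end = j
    return count
```

For example, `recombination_lower_bound(["00", "01", "10", "11"])` finds one incompatible pair of sites, while a sample such as `["00", "01", "11"]` shows only three gametes and yields a bound of zero. How this bound relates to T_M is established in the article itself and is not reproduced here.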

Original language: English
Journal: Theoretical Population Biology
Volume: 62
Issue number: 4
Pages (from-to): 357-363
Number of pages: 7
ISSN: 0040-5809
Publication status: Published - 1 Dec 2002
Externally published: Yes

    Research areas

  • Algorithm, Recombination, SNP, Topology

ID: 203903433