On the minimum number of topologies explaining a sample of DNA sequences

Publication: Contribution to journal, Journal article, Research, Peer-reviewed

In this article I derive an alternative to Hudson and Kaplan's algorithm (Genetics 111, 147-165), which gives a lower bound on the number of recombination events in a sample's history. It is shown that the number, T_M, found by the alternative algorithm is the least number of topologies required to explain a set of DNA sequences sampled under the infinite-sites assumption. Let T = (T_1, ⋯, T_r) be a list of topologies compatible with the sequences, i.e., T_k is compatible with an interval, I_k, of sites in the alignment. A characterization of all lists having T_M topologies is given, and it is shown that T_M relates to specific patterns in the alignment, here called chain series. Further, a number of theorems relating general lists of topologies to the number T_M are presented. The results are discussed in relation to the true minimum number of recombination events required to explain an alignment.
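
To make the idea of covering an alignment with topologies concrete, the sketch below scans the sites from left to right and starts a new interval whenever the next site fails the four-gamete test against some site already in the current interval. It is an illustration only, not the article's algorithm: the abstract does not spell out the procedure, so the greedy scan, the function names `compatible` and `min_topology_intervals`, and the encoding of sequences as equal-length strings over two alleles are all assumptions made here.

```python
def compatible(col_a, col_b):
    """Four-gamete test for two biallelic sites (columns of the alignment).

    The sites are compatible with a single tree under the infinite-sites
    assumption if fewer than four distinct allele pairs are observed.
    """
    return len(set(zip(col_a, col_b))) < 4


def min_topology_intervals(sequences):
    """Greedy partition of sites into intervals of pairwise compatible sites.

    Hypothetical helper, not taken from the article. `sequences` is a list
    of equal-length strings over two alleles (e.g. '0' and '1').
    Returns a list of (start, end) site indices, 0-based and inclusive.
    """
    columns = list(zip(*sequences))  # one tuple of alleles per site
    if not columns:
        return []
    intervals = []
    start = 0
    for j in range(1, len(columns)):
        # Close the current interval as soon as site j conflicts with
        # any site already in it.
        if any(not compatible(columns[i], columns[j]) for i in range(start, j)):
            intervals.append((start, j - 1))
            start = j
    intervals.append((start, len(columns) - 1))
    return intervals


# Toy alignment: sites 0 and 1 show all four gametes, sites 1 and 2 do not.
sample = ["000", "010", "100", "111"]
intervals = min_topology_intervals(sample)
print(intervals, len(intervals))
```

For the toy alignment, the scan splits the three sites into two intervals, so two topologies are needed under this sketch's assumptions. Because any sub-interval of a pairwise compatible interval is itself pairwise compatible, extending each interval as far as possible cannot use more intervals than any other contiguous partition.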

Original language: English
Journal: Theoretical Population Biology
Volume: 62
Issue number: 4
Pages (from-to): 357-363
Number of pages: 7
ISSN: 0040-5809
Status: Published - 1 Dec 2002
Externally published: Yes

ID: 203903433