Inferring population history from genealogical trees

Publication: Contribution to journal; Journal article; Research; Peer-reviewed

Inference about population history from DNA sequence data has become increasingly popular. For human populations, questions about whether a population has been expanding and when expansion began are often the focus of attention. For viral populations, questions about the epidemiological history of a virus, e.g., HIV-1 and Hepatitis C, are often of interest. In this paper I address the following question: Can population history be accurately inferred from single-locus DNA data? An idealised world is considered in which the tree relating a sample of n non-recombining and selectively neutral DNA sequences is observed, rather than just the sequences themselves. This approach provides an upper limit to the information that can possibly be extracted from a sample. It is shown, based on Kingman's (1982a) coalescent process, that consistent estimation of parameters describing population history (e.g., a growth rate) cannot be achieved for increasing sample size, n. This is worse than often found for estimators of genetic parameters; for example, an estimator of the mutation rate typically converges at rate √log(n) under the assumption that all historical mutations can be observed in the sample. In addition, various results for the distribution of maximum likelihood estimators are presented.
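
The √log(n) rate mentioned for the mutation rate can be motivated with a small simulation. The sketch below is not taken from the paper. It assumes the standard Kingman coalescent with constant population size, with time measured in coalescent units, and the function names are hypothetical. While k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2, so the expected total branch length of the tree is 2(1 + 1/2 + ... + 1/(n-1)), which grows roughly like 2 log(n). If mutations fall on the tree as a Poisson process, the amount of data available about the mutation rate grows at this same slow, logarithmic pace.

```python
import random


def simulate_total_tree_length(n, rng):
    """Total branch length of one simulated coalescent tree for n sequences.

    Assumes a constant-size Kingman coalescent (an illustrative assumption):
    while k lineages remain, the time to the next coalescence is exponential
    with rate k*(k-1)/2, and all k branches extend for that long.
    """
    total = 0.0
    for k in range(n, 1, -1):
        waiting_time = rng.expovariate(k * (k - 1) / 2)
        total += k * waiting_time
    return total


def expected_total_tree_length(n):
    """Expected total branch length: 2 * (1 + 1/2 + ... + 1/(n-1))."""
    return 2.0 * sum(1.0 / i for i in range(1, n))


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (10, 100, 1000):
        sims = [simulate_total_tree_length(n, rng) for _ in range(2000)]
        mean = sum(sims) / len(sims)
        # Columns: sample size, simulated mean, theoretical expectation
        print(n, round(mean, 2), round(expected_total_tree_length(n), 2))
```

Increasing n a hundredfold, from 10 to 1000, less than triples the expected tree length, which is one way to see why adding sequences from a single locus yields so little extra information.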

Original language: English
Journal: Journal of Mathematical Biology
Volume: 46
Issue number: 3
Pages (from-to): 241-264
Number of pages: 24
ISSN: 0303-6812
Status: Published - 1 Mar 2003
Externally published: Yes

ID: 203902681