Inferring population history from genealogical trees

Research output: Contribution to journal › Journal article › Research › peer-review

Inference about population history from DNA sequence data has become increasingly popular. For human populations, questions about whether a population has been expanding and when expansion began are often the focus of attention. For viral populations, questions about the epidemiological history of a virus, e.g., HIV-1 and Hepatitis C, are often of interest. In this paper I address the following question: Can population history be accurately inferred from single locus DNA data? An idealised world is considered in which the tree relating a sample of n non-recombining and selectively neutral DNA sequences is observed, rather than just the sequences themselves. This approach provides an upper limit to the information that possibly can be extracted from a sample. It is shown, based on Kingman's (1982a) coalescent process, that consistent estimation of parameters describing population history (e.g., a growth rate) cannot be achieved for increasing sample size, n. This is worse than often found for estimators of genetic parameters, e.g., the mutation rate typically converges at rate √log(n) under the assumption that all historical mutations can be observed in the sample. In addition, various results for the distribution of maximum likelihood estimators are presented.
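The slow convergence mentioned for the mutation rate can be illustrated with a small sketch. It assumes a constant-size population, the standard Kingman coalescent in coalescent time units, and the infinite-sites model, and it uses Watterson's estimator (number of segregating sites divided by a_n) as a stand-in; the abstract does not say which estimator the paper has in mind. The function names are hypothetical and chosen for illustration only. Under these assumptions the expected total tree length is 2·a_n with a_n = 1 + 1/2 + ... + 1/(n-1) ≈ log(n), so the estimator's variance shrinks only like 1/log(n).

```python
import math
import random


def harmonic(n: int) -> float:
    """a_n = sum_{k=1}^{n-1} 1/k, which grows like log(n)."""
    return sum(1.0 / k for k in range(1, n))


def simulate_tree_length(n: int, rng: random.Random) -> float:
    """Total branch length of one Kingman coalescent tree on n sequences.

    Assumption: constant population size, time in coalescent units.
    While k lineages remain, the waiting time to the next coalescence
    is exponential with rate k(k-1)/2, and k branches grow during it.
    """
    total = 0.0
    for k in range(n, 1, -1):
        t_k = rng.expovariate(k * (k - 1) / 2.0)
        total += k * t_k
    return total


def watterson_variance(n: int, theta: float) -> float:
    """Variance of Watterson's estimator S / a_n (constant size,
    infinite sites): theta / a_n + theta^2 * b_n / a_n^2."""
    a_n = harmonic(n)
    b_n = sum(1.0 / k**2 for k in range(1, n))
    return theta / a_n + theta**2 * b_n / a_n**2


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (10, 100, 1000, 10000):
        mean_len = sum(simulate_tree_length(n, rng) for _ in range(500)) / 500
        sd = math.sqrt(watterson_variance(n, theta=1.0))
        print(f"n={n:6d}  mean tree length={mean_len:6.2f}  "
              f"2*a_n={2 * harmonic(n):6.2f}  sd(theta_hat)={sd:.3f}")
```

Running the script shows the simulated mean tree length tracking 2·a_n while the standard deviation of the estimator falls very slowly as n grows by orders of magnitude. It does not reproduce the paper's result for population-history parameters, where consistency fails altogether.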

Original language: English
Journal: Journal of Mathematical Biology
Volume: 46
Issue number: 3
Pages (from-to): 241-264
Number of pages: 24
ISSN: 0303-6812
Publication status: Published - 1 Mar 2003
Externally published: Yes

Research areas

• Coalescent process, Genealogy, Maximum likelihood inference, Population history

ID: 203902681