The Behavior of Admixed Populations in Neighbor-Joining Inference of Population Trees

Naama M. Kopelman1, Lewi Stone2, Olivier Gascuel3, Noah A. Rosenberg4

1Porter School of Environmental Studies, Department of Zoology, Tel Aviv University; 2Méthodes et Algorithmes pour la Bioinformatique, LIRMM-CNRS; 3Department of Biology, Stanford University, Stanford

Pacific Symposium on Biocomputing 18:273-284(2013)


Neighbor-joining is one of the most widely used methods for constructing evolutionary trees. This approach from phylogenetics is often employed in population genetics, where distance matrices obtained from allele frequencies are used to produce a representation of population relationships in the form of a tree. In phylogenetics, the utility of neighbor-joining derives partly from a result that for a class of distance matrices including those that are additive or tree-like (generated by summing weights over the edges connecting pairs of taxa in a tree to obtain pairwise distances), application of neighbor-joining recovers exactly the underlying tree. For populations within a species, however, migration and admixture can produce distance matrices that reflect more complex processes than those obtained from the bifurcating trees typical in the multispecies context. Admixed populations, that is, populations descended from recent mixture of groups that have long been separated, have been observed to be located centrally in inferred neighbor-joining trees, with short external branches incident to the path connecting their source populations. Here, using a simple model, we explore mathematically the behavior of an admixed population under neighbor-joining. We show that with an additive distance matrix, a population admixed between two source populations necessarily lies on the path between the sources. Relaxing the additivity requirement, we examine the smallest nontrivial case (four populations, one of which is admixed between two of the other three), showing that the two source populations never merge with each other before one of them merges with the admixed population. Furthermore, the distance on the constructed tree between the admixed population and either source population is always smaller than the distance between the source populations, and the external branch for the admixed population is always incident to the path connecting the sources.
We define three properties that hold for four taxa and that we hypothesize are satisfied under more general conditions: antecedence of clustering, intermediacy of distances, and intermediacy of path lengths. Our findings can inform interpretations of neighbor-joining trees with admixed groups, and they provide an explanation for patterns observed in trees of human populations.
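
The four-population case described above can be illustrated with a minimal Python sketch of the first neighbor-joining step, which uses the standard criterion Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of distances from taxon i. The distance model used here is an assumption made for illustration and is not necessarily the model analyzed in the paper: the admixed population C draws a fraction alpha of its ancestry from source A and 1 - alpha from source B, and its distance to every population is taken to be the corresponding linear combination of the source distances. The function names and numerical values are hypothetical.

```python
from itertools import combinations


def admixed_distances(d_ab, d_ad, d_bd, alpha):
    """Distance matrix for populations A, B, C, D (indices 0..3).

    ASSUMPTION (illustrative only): C is admixed between A and B with a
    fraction alpha from A, and its distances are linear in alpha.
    """
    d_ac = (1 - alpha) * d_ab                 # C is closer to A when alpha is large
    d_bc = alpha * d_ab
    d_cd = alpha * d_ad + (1 - alpha) * d_bd  # mixture of the source distances to D
    return [
        [0.0, d_ab, d_ac, d_ad],
        [d_ab, 0.0, d_bc, d_bd],
        [d_ac, d_bc, 0.0, d_cd],
        [d_ad, d_bd, d_cd, 0.0],
    ]


def q_criterion(dist):
    """Neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - r_i - r_j."""
    n = len(dist)
    row_sums = [sum(row) for row in dist]
    return {
        (i, j): (n - 2) * dist[i][j] - row_sums[i] - row_sums[j]
        for i, j in combinations(range(n), 2)
    }


def first_pairs(dist, tol=1e-9):
    """All pairs that minimize Q, i.e. candidates for the first join."""
    q = q_criterion(dist)
    q_min = min(q.values())
    return sorted(pair for pair, value in q.items() if value <= q_min + tol)


NAMES = "ABCD"
dist = admixed_distances(d_ab=10.0, d_ad=8.0, d_bd=12.0, alpha=0.5)
for i, j in first_pairs(dist):
    print(NAMES[i] + NAMES[j])
```

With four taxa, the two pairs that make up a split always receive the same Q value, so the minimizing pairs come in complementary couples. In this example the candidates are A with D and B with C, both of which correspond to the same unrooted topology; the source pair A and B is not among them, in line with the antecedence of clustering property stated in the abstract.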
