The Behavior of Admixed Populations in Neighbor-Joining Inference of Population Trees

Naama M. Kopelman^{1}, Lewi Stone^{2}, Olivier Gascuel^{3}, Noah A. Rosenberg^{4}

^{1}Porter School of Environmental Studies, Department of Zoology, Tel Aviv University; ^{2}Méthodes et Algorithmes pour la Bioinformatique, LIRMM-CNRS; ^{3}Department of Biology, Stanford University, Stanford
Email: noahr@stanford.edu

Pacific Symposium on Biocomputing 18:273-284 (2013)

Abstract

Neighbor-joining is one of the most widely used methods for constructing evolutionary trees. This approach from phylogenetics is often employed in population genetics, where distance matrices obtained from allele frequencies are used to produce a representation of population relationships in the form of a tree. In phylogenetics, the utility of neighbor-joining derives partly from a result that for a class of distance matrices including those that are additive or tree-like (generated by summing weights over the edges connecting pairs of taxa in a tree to obtain pairwise distances), application of neighbor-joining recovers exactly the underlying tree. For populations within a species, however, migration and admixture can produce distance matrices that reflect more complex processes than those obtained from the bifurcating trees typical in the multispecies context. Admixed populations (populations descended from recent mixture of groups that have long been separated) have been observed to be located centrally in inferred neighbor-joining trees, with short external branches incident to the path connecting their source populations. Here, using a simple model, we explore mathematically the behavior of an admixed population under neighbor-joining. We show that with an additive distance matrix, a population admixed among two source populations necessarily lies on the path between the sources. Relaxing the additivity requirement, we examine the smallest nontrivial case (four populations, one of which is admixed between two of the other three), showing that the two source populations never merge with each other before one of them merges with the admixed population. Furthermore, the distance on the constructed tree between the admixed population and either source population is always smaller than the distance between the source populations, and the external branch for the admixed population is always incident to the path connecting the sources.
We define three properties that hold for four taxa and that we hypothesize are satisfied under more general conditions: antecedence of clustering, intermediacy of distances, and intermediacy of path lengths. Our findings can inform interpretations of neighbor-joining trees with admixed groups, and they provide an explanation for patterns observed in trees of human populations.
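
The four-population case can be illustrated with a minimal Python sketch of the first neighbor-joining step. It is not the authors' code. It assumes, purely for illustration, a linear mixing model in which the distance from the admixed population C to any population X is alpha * d(A, X) + (1 - alpha) * d(B, X); this model, the labels A, B, C, D, and the function names `admixed_distances` and `first_nj_split` are all hypothetical and may differ from the model used in the paper. The pair to join is chosen with the standard neighbor-joining criterion Q(i, j) = (n - 2) d(i, j) - r_i - r_j, where r_i is the sum of distances from i to all other taxa.

```python
from itertools import combinations


def admixed_distances(d_ab, d_ad, d_bd, alpha):
    """Pairwise distances among sources A and B, admixed C, and a fourth population D.

    Assumption of this sketch (not taken from the paper): the distance from C
    to any population X is alpha * d(A, X) + (1 - alpha) * d(B, X).
    """
    return {
        ("A", "B"): d_ab,
        ("A", "D"): d_ad,
        ("B", "D"): d_bd,
        ("A", "C"): (1 - alpha) * d_ab,  # alpha * d(A, A) + (1 - alpha) * d(B, A)
        ("B", "C"): alpha * d_ab,        # alpha * d(A, B) + (1 - alpha) * d(B, B)
        ("C", "D"): alpha * d_ad + (1 - alpha) * d_bd,
    }


def dist(d, x, y):
    """Look up a symmetric distance stored under either key order."""
    return d[(x, y)] if (x, y) in d else d[(y, x)]


def first_nj_split(d, taxa=("A", "B", "C", "D")):
    """Return the split of four taxa implied by the first neighbor-joining merge.

    With four taxa, a pair and its complement have the same Q value, so the
    result is reported as an unordered split {pair, complement}.
    """
    n = len(taxa)
    # r[x] is the sum of distances from x to every other taxon.
    r = {x: sum(dist(d, x, y) for y in taxa if y != x) for x in taxa}
    # Standard neighbor-joining criterion; the pair with the smallest Q is joined.
    q = {
        (x, y): (n - 2) * dist(d, x, y) - r[x] - r[y]
        for x, y in combinations(taxa, 2)
    }
    pair = min(q, key=q.get)
    rest = tuple(t for t in taxa if t not in pair)
    return frozenset({frozenset(pair), frozenset(rest)})


# Example with hypothetical distances; C draws 30% of its ancestry weight from A.
d = admixed_distances(d_ab=1.0, d_ad=0.8, d_bd=0.9, alpha=0.3)
split = first_nj_split(d)
sources_joined = frozenset({"A", "B"}) in split  # False for these inputs
```

For these inputs the first merge groups B with C (equivalently, A with D), so the two sources are not joined to each other first. Under the assumed mixing model this follows from the triangle inequality on d(A, B), d(A, D), and d(B, D), which is consistent with the antecedence-of-clustering property stated in the abstract; the sketch does not reproduce the paper's proofs.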