Identifying Parent-Daughter Relationships Among Duplicated Genes


Mira V. Han and Matthew W. Hahn


Department of Biology and School of Informatics, Indiana University, Bloomington, IN 47405 USA


Pacific Symposium on Biocomputing 14:114-125 (2009)


Abstract

In this paper, we use the length of shared synteny between genes to identify the "parent" ortholog among multiple lineage-specific duplicated genes. Genes in the region around each duplicated paralog are compared with the genes flanking an outgroup ortholog to estimate the probability of observing homologs in syntenic vs. non-syntenic regions. The length of the shared synteny is introduced as a hidden variable and is estimated for each lineage-specific paralog using Expectation-Maximization. Assuming that the original, parental gene preserves the longest synteny with the outgroup gene, and that any daughter genes have a shorter syntenic block, we are able to determine parent-daughter relationships. We apply this method to lineage-specific duplications in the human genome and show that we can determine the direction and size of the duplication events that have created hundreds of genes.
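The abstract does not spell out the likelihood model, so the following is only a toy sketch of the general idea, not the authors' implementation. It assumes each paralog is summarized by a list of 0/1 flags, ordered by distance from the paralog, recording whether each flanking gene has a homolog near the outgroup ortholog. It further assumes a single one-sided syntenic block of hidden length L, a uniform prior on L, and two Bernoulli rates (`p_syn` inside the block, `p_non` outside). All function and parameter names are hypothetical.

```python
import math

def length_posterior(hits, p_syn, p_non):
    """Posterior P(L = k | hits) for k = 0..len(hits), uniform prior on L.

    Assumed model: the first L flanking genes are syntenic (homolog rate
    p_syn); the remaining genes are non-syntenic (homolog rate p_non).
    """
    log_lik = []
    for k in range(len(hits) + 1):
        total = 0.0
        for i, h in enumerate(hits):
            p = p_syn if i < k else p_non
            total += math.log(p if h else 1.0 - p)
        log_lik.append(total)
    peak = max(log_lik)  # subtract the max for numerical stability
    weights = [math.exp(v - peak) for v in log_lik]
    norm = sum(weights)
    return [w / norm for w in weights]

def fit_synteny_em(profiles, p_syn=0.8, p_non=0.1, iters=50, eps=1e-3):
    """Estimate p_syn and p_non by EM, treating each block length as hidden."""
    def clamp(p):
        return min(1.0 - eps, max(eps, p))

    for _ in range(iters):
        syn_hits = syn_n = non_hits = non_n = 0.0
        for hits in profiles:
            # E-step: posterior over the hidden length for this paralog.
            post = length_posterior(hits, p_syn, p_non)
            in_block = 1.0
            for i, h in enumerate(hits):
                # After subtracting post[0..i], in_block equals P(L > i),
                # the probability that gene i lies inside the block.
                in_block = max(0.0, in_block - post[i])
                syn_n += in_block
                syn_hits += in_block * h
                non_n += 1.0 - in_block
                non_hits += (1.0 - in_block) * h
        # M-step: re-estimate both rates from the expected counts.
        if syn_n > 0:
            p_syn = clamp(syn_hits / syn_n)
        if non_n > 0:
            p_non = clamp(non_hits / non_n)
    posts = [length_posterior(h, p_syn, p_non) for h in profiles]
    return p_syn, p_non, posts

def map_length(post):
    """Most probable block length under the posterior."""
    return max(range(len(post)), key=lambda k: post[k])

def pick_parent(posts):
    """Index of the paralog with the longest expected syntenic block."""
    expected = [sum(k * p for k, p in enumerate(post)) for post in posts]
    return max(range(len(expected)), key=lambda j: expected[j])

# Made-up flags for two paralogs of one outgroup gene (illustrative only).
profiles = [
    [1, 1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
]
p_syn, p_non, posts = fit_synteny_em(profiles)
parent_index = pick_parent(posts)
```

In this sketch the paralog with the longer inferred block is labeled the parent, mirroring the assumption stated in the abstract; a real analysis would need to handle both flanks, gene loss, and ties between paralogs.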

