Gathering the Gold Dust: Methods for Assessing the Aggregate Impact of Small Effect Genes in Genomic Scans

Michael A. Province, Ingrid B. Borecki


Division of Statistical Genomics, Box 8506, Center for Genome Sciences, Washington University School of Medicine, 4444 Forest Park Blvd, St. Louis, MO, 63108, USA


Pac Symp Biocomput. 2008;:190-200.


Abstract

Mining genomewide association scan (GWAS) data has uncovered moderate-effect “gold nugget” genes for complex traits. But for many traits, much of the explanatory variance may be truly polygenic, more like gold dust, with marginal effects too small to detect by traditional methods. Yet their collective effects may be quite important in advancing personalized medicine. We consider a novel approach to sift out the genetic gold dust influencing quantitative (or qualitative) traits. From a GWAS, we randomly grab handfuls of SNPs and model their effects in a multiple linear (or logistic) regression. The model’s significance is used to iteratively update a pseudo-Bayesian posterior probability associated with each SNP, and the process is repeated over many random draws until the distribution becomes stable. A stepwise procedure then culls the list of SNPs to define the final set. Results from a benchmark simulation of 5 quantitative trait genes among 1,000, in 1,000 random subjects, are contrasted with marginal tests using nominal significance, Bonferroni-corrected significance, and false discovery rates, as well as with serial selection methods. Random handfuls produced the best combination of sensitivity (0.95), specificity (0.99), and true positive rate (0.71) of all methods tested, and better replicability in an independent subject set. From more extensive simulations, we determine which combinations of signal-to-noise ratios, SNP typing densities, and sample sizes are tractable with which methods to gather the gold dust.
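To make the random-handfuls idea concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract does not specify the pseudo-Bayesian update, the handful size, or the stopping rule, so several pieces here are assumptions: each SNP is scored by the average model F statistic over the handfuls that contained it (a simple stand-in for the posterior update), the handful size and number of draws are fixed in advance, the stepwise culling step is omitted, and the simulated data set is smaller than the benchmark described above to keep the run short. The names `simulate` and `random_handfuls` are hypothetical.

```python
import numpy as np


def simulate(n=500, m=100, n_causal=5, effect=0.8, seed=1):
    """Simulate additive genotypes (0/1/2) and a quantitative trait.

    All parameter values are illustrative assumptions, not the paper's.
    """
    rng = np.random.default_rng(seed)
    maf = rng.uniform(0.2, 0.5, size=m)                # minor allele frequencies
    X = rng.binomial(2, maf, size=(n, m)).astype(float)
    causal = rng.choice(m, size=n_causal, replace=False)
    y = X[:, causal] @ np.full(n_causal, effect) + rng.normal(size=n)
    return X, y, causal


def random_handfuls(X, y, handful_size=10, n_draws=2000, seed=0):
    """Score each SNP by the mean F statistic of the handfuls it appeared in.

    Assumption: averaging the model F statistic stands in for the
    pseudo-Bayesian posterior update, which the abstract does not specify.
    With a fixed handful size and sample size, a larger F statistic always
    corresponds to a smaller model p-value.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    score_sum = np.zeros(m)
    counts = np.zeros(m)
    y_centered = y - y.mean()
    tss = float(y_centered @ y_centered)               # total sum of squares
    for _ in range(n_draws):
        idx = rng.choice(m, size=handful_size, replace=False)
        design = np.column_stack([np.ones(n), X[:, idx]])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ beta
        rss = float(resid @ resid)                     # residual sum of squares
        f_stat = ((tss - rss) / handful_size) / (rss / (n - handful_size - 1))
        score_sum[idx] += f_stat                       # credit every SNP in the handful
        counts[idx] += 1
    return score_sum / np.maximum(counts, 1)


X, y, causal = simulate()
scores = random_handfuls(X, y)
top = np.argsort(scores)[::-1][:5]                     # five highest-scoring SNPs
print("causal SNPs:", sorted(causal.tolist()))
print("top-scored SNPs:", sorted(top.tolist()))
```

Because a SNP with a real effect raises the fit of every handful it lands in, its average score drifts above that of null SNPs as draws accumulate. A fuller version would replace the fixed number of draws with a check that the scores have stabilized, and would follow with the stepwise culling step.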

