Improved gene selection for classification of microarrays

Jaeger J, Sengupta R, Ruzzo WL

Department of Computer Science & Engineering, University of Washington, 114 Sieg Hall, Box 352350, Seattle, WA 98195, USA.

Pac Symp Biocomput. 2003;:53-64.


Abstract

In this paper we derive a method for evaluating and improving techniques for selecting informative genes from microarray data. Genes of interest are typically selected by ranking genes according to a test statistic and then choosing the top k genes. A problem with this approach is that many of the top-ranked genes are highly correlated with one another, so they carry largely redundant information. For classification purposes it would be ideal to have genes that are distinct but still highly informative. We propose three pre-filter methods, two based on clustering and one based on correlation, to retrieve groups of similar genes. We then apply a test statistic within these groups to select the genes of interest. We show that this filtered set of genes can be used to significantly improve existing classifiers.
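
The contrast between plain top-k ranking and correlation-aware selection can be illustrated with a minimal Python sketch. This is not the authors' procedure: the abstract does not specify the test statistic or the filtering rule, so the sketch assumes a Welch t-statistic for ranking and a greedy rule that skips any gene whose absolute Pearson correlation with an already selected gene reaches a cutoff. The function names (t_scores, select_genes) and the max_corr parameter are hypothetical.

```python
import numpy as np

def t_scores(X, y):
    """Absolute Welch t-statistic per gene.

    X is a (samples, genes) expression matrix, y holds 0/1 class labels.
    Assumes every gene has nonzero variance in both classes.
    """
    a, b = X[y == 0], X[y == 1]
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / se

def select_genes(X, y, k, max_corr=0.9):
    """Greedy, correlation-aware selection (an assumed rule, for illustration).

    Walk the genes from highest to lowest t-score and keep a gene only if its
    absolute correlation with every gene kept so far is below max_corr.
    May return fewer than k genes if too few pass the filter.
    """
    order = np.argsort(-t_scores(X, y))             # best-scoring genes first
    corr = np.abs(np.corrcoef(X, rowvar=False))     # gene-by-gene correlations
    chosen = []
    for g in order:
        if all(corr[g, c] < max_corr for c in chosen):
            chosen.append(int(g))
        if len(chosen) == k:
            break
    return chosen
```

With plain ranking, two near-duplicate genes can occupy the top two positions. The greedy rule keeps only one of them and moves on to the next informative gene that is sufficiently distinct. The cutoff needs care in practice, because genes that separate the classes well are correlated with each other through the class label alone.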

