Recommending Pathway Genes Using a Compendium of Clustering Solutions

Ng DM, Woehrmann MH, Stuart JM

Department of Biomolecular Engineering, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
E-mail: jstuart@soe.ucsc.edu


Pac Symp Biocomput. 2007;:379-390.


Abstract

A common approach for identifying pathways from gene expression data is to cluster the genes without using prior information about a pathway, which often identifies only the dominant coexpression groups. Recommender systems are well-suited for using the known genes of a pathway to identify the appropriate experiments for predicting new members. However, existing systems, such as the GeneRecommender, ignore how genes naturally group together within specific experiments. We present a collaborative filtering approach which uses the pattern of how genes cluster together in different experiments to recommend new genes in a pathway. Clusters are first identified within a single experiment series. Informative clusters, in which the user-supplied query genes appear together, are identified. New genes that cluster with the known genes, in a significant fraction of the informative clusters, are recommended. We implemented a prototype of our system and measured its performance on hundreds of pathways. We find that our method performs as well as an established approach while significantly increasing the speed and scalability of searching large datasets. [Supplemental material is available online at sysbio.soe.ucsc.edu/cluegene/psb07.]
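
The three steps described in the abstract (cluster each experiment series, keep the informative clusters, recommend genes that co-cluster with the query) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the clustering solutions are already computed and supplied as sets of gene names, treats a cluster as informative when it contains at least `min_query_hits` query genes, and stands in a plain fraction threshold (`min_fraction`) for whatever significance criterion the paper uses. The function name, parameter names, and toy gene labels are all hypothetical.

```python
from collections import Counter


def recommend(clusterings, query, min_query_hits=2, min_fraction=0.5):
    """Rank non-query genes by how often they share a cluster with the query.

    clusterings: list of clustering solutions, one per experiment series;
                 each solution is a list of sets of gene names (assumed input).
    query:       set of known pathway genes supplied by the user.
    """
    query = set(query)

    # Assumption: a cluster is "informative" if it holds enough query genes.
    informative = [
        cluster
        for solution in clusterings
        for cluster in solution
        if len(cluster & query) >= min_query_hits
    ]
    if not informative:
        return []

    # Count, for each candidate gene, the informative clusters it appears in.
    counts = Counter(g for cluster in informative for g in cluster - query)

    # Assumption: a simple fraction cutoff replaces a statistical test.
    scored = [(g, n / len(informative)) for g, n in counts.items()]
    kept = [(g, f) for g, f in scored if f >= min_fraction]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Toy compendium: three experiment series, each clustered independently.
clusterings = [
    [{"A", "B", "X"}, {"C", "D"}],
    [{"A", "B", "X", "Y"}, {"C"}],
    [{"A", "X"}, {"B", "Y"}],
]
recommendations = recommend(clusterings, {"A", "B"})
print(recommendations)
```

In this toy input only the first two series group the query genes A and B together, so the third series contributes no informative cluster and does not influence the ranking.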

