Predicting gene function from gene expressions and ontologies

Hvidsten TR, Komorowski J, Sandvik AK, Laegreid A

Knowledge Systems Group, Department of Information and Computer Science, Norwegian University of Science and Technology, 7491 Trondheim, Norway.

Pac Symp Biocomput. 2001:299-310.


We introduce a methodology for inducing predictive rule models for the functional classification of gene expression profiles from microarray hybridisation experiments. The basic learning method is the rough set framework for rule induction. The methodology differs from the commonly used unsupervised clustering approaches in that it exploits background knowledge of gene function in a supervised manner. Genes are annotated using Ashburner's Gene Ontology, and the functional classes used for learning are mined from these annotations. From the original expression data, we extract a set of biologically meaningful features that are used for learning. A rule model is induced from the data described in terms of these features. Its predictive quality is fine-tuned via cross-validation on subsets of the known genes prior to classification of unknown genes. The predictive and descriptive quality of such a rule model is demonstrated on the fibroblast serum response data previously analysed by Iyer et al. Our analysis shows that the rules are capable of representing the complex relationship between gene expression and function, and that it is possible to put forward high-quality hypotheses about the function of unknown genes.
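
The abstract does not say which features are extracted or how the rules are encoded, so the following Python sketch is only an illustration of the general idea of describing an expression time series with discrete features and matching if-then rules against them. The trend labels, the 0.5 threshold, the interval encoding, the example profile, and both rules with their class names are assumptions made for this example; the rules are written by hand here, whereas the paper induces them with the rough set framework.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[int, int]           # (start index, end index) of time points
Features = Dict[Interval, str]       # interval -> trend label
Rule = Tuple[Features, str]          # (conditions, functional class)


def trend(values: Sequence[float], threshold: float = 0.5) -> str:
    """Label the net change between the first and last value (assumed threshold)."""
    change = values[-1] - values[0]
    if change > threshold:
        return "increasing"
    if change < -threshold:
        return "decreasing"
    return "constant"


def extract_features(profile: Sequence[float], min_span: int = 2,
                     threshold: float = 0.5) -> Features:
    """Assign a trend label to every interval spanning at least min_span steps."""
    features: Features = {}
    n = len(profile)
    for start in range(n):
        for end in range(start + min_span, n):
            features[(start, end)] = trend(profile[start:end + 1], threshold)
    return features


def classify(features: Features, rules: List[Rule]) -> List[str]:
    """Return the classes of all rules whose conditions all hold for this gene."""
    return sorted({
        label
        for conditions, label in rules
        if all(features.get(iv) == lab for iv, lab in conditions.items())
    })


# Hypothetical log-ratio profile of one gene at five time points.
profile = [0.0, 0.1, 1.2, 1.5, 0.3]

# Hypothetical hand-written rules; class names are placeholders.
rules: List[Rule] = [
    ({(0, 2): "increasing", (2, 4): "decreasing"}, "class A"),
    ({(0, 4): "decreasing"}, "class B"),
]

features = extract_features(profile)
predicted = classify(features, rules)
print(predicted)
```

A gene may match several rules, which is why `classify` returns a list of classes instead of a single label; how conflicting or multiple matches are resolved in the actual method is not stated in the abstract.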
