Promoter region-based classification of genes

Pavlidis P, Furey TS, Liberto M, Haussler D, Grundy WN

Columbia Genome Center, Columbia University, USA. pp175@columbia.edu

Pac Symp Biocomput. 2001:151-63.


Abstract

In this paper we consider the problem of extracting information from the upstream untranslated regions of genes to make predictions about their transcriptional regulation. We present a method for classifying genes based on motif-based hidden Markov models (HMMs) of their promoter regions. Sequence motifs discovered in yeast promoters are used to construct HMMs that include parameters describing the number and relative locations of motifs within each sequence. Each model provides a Fisher kernel for a support vector machine, which can be used to predict the classifications of unannotated promoters. We demonstrate this method on two classes of genes from the budding yeast, S. cerevisiae. Our results suggest that the additional sequence features captured by the HMM assist in correctly classifying promoters.
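
The Fisher kernel idea mentioned above can be illustrated with a small sketch. This is not the motif-based HMM described in the abstract: as a stated simplification, it swaps in a zero-order nucleotide composition model with softmax-parameterized base probabilities, and it approximates the Fisher information matrix by the identity. The names `fisher_score`, `fisher_kernel`, `theta`, and the example sequences are hypothetical, chosen only for illustration.

```python
ALPHABET = "ACGT"

def fisher_score(seq, theta):
    """Fisher score of seq under a toy zero-order model (an assumption,
    not the paper's motif HMM).

    With theta = softmax(w), log P(seq) = sum_a n_a * log(theta_a), and the
    gradient with respect to logit w_a is n_a - N * theta_a, where n_a is
    the count of base a and N is the number of counted bases.
    """
    counts = [seq.count(base) for base in ALPHABET]
    total = sum(counts)
    return [n - total * p for n, p in zip(counts, theta)]

def fisher_kernel(seq_a, seq_b, theta):
    """Inner product of Fisher scores (Fisher information taken as identity)."""
    u_a = fisher_score(seq_a, theta)
    u_b = fisher_score(seq_b, theta)
    return sum(x * y for x, y in zip(u_a, u_b))

# Hypothetical base probabilities (A, C, G, T) and toy "promoter" sequences.
theta = [0.3, 0.2, 0.2, 0.3]
promoters = ["TATAAAGC", "TTATATAG", "GCGCGGCC"]

# Gram matrix of pairwise kernel values.
gram = [[fisher_kernel(a, b, theta) for b in promoters] for a in promoters]
```

A Gram matrix like `gram` is the kind of input a support vector machine with a precomputed kernel accepts (for example, scikit-learn's `SVC(kernel="precomputed")`). In the setting the abstract describes, the gradient would instead be taken with respect to the parameters of the motif-based HMM, so the scores would also reflect the number and relative locations of motifs.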

