PSB 2002 Tutorial
Pattern recognition problems in Bioinformatics are characterized by high-dimensional and noisy data, and by the need to incorporate prior knowledge into the system and to merge heterogeneous sources of data. The class of Kernel Methods has been designed to work in such conditions, and delivers state-of-the-art performance in many bioinformatics applications.
Based on recent developments in learning theory, this new generation of learning systems introduces several innovations, from both the theoretical and the computational viewpoint. Its best-known instance, the Support Vector Machine, has been used to analyze DNA microarray data, sequence data, phylogenetic information, promoter region information, etc. Although the methods are sophisticated, their basic ideas can be introduced with some illustrative examples.
One of the main features of these systems is their modularity: they are formed by a general purpose learning module and by a problem specific
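The modularity described above can be illustrated with a small sketch. It assumes that the problem specific component is a kernel function, as is standard for kernel methods, and it uses a kernel perceptron as a stand-in for the general purpose learning module (a Support Vector Machine would occupy the same role). All function names and the toy data are hypothetical and chosen only for illustration.

```python
import math

# --- Problem specific part: kernel functions (assumed, illustrative) ---

def linear_kernel(x, z):
    """Plain dot product between two feature vectors."""
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=1.0):
    """Gaussian (RBF) kernel; gamma controls the width."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# --- General purpose part: a kernel perceptron (no bias term) ---
# It touches the data only through kernel(x, z), so any kernel can be plugged in.

def decision_value(X, y, alpha, kernel, x):
    """Weighted sum of kernel evaluations against the training points."""
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alpha, X, y))

def train_kernel_perceptron(X, y, kernel, epochs=20):
    """Return one mistake counter per training example."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, (xi, yi) in enumerate(zip(X, y)):
            if yi * decision_value(X, y, alpha, kernel, xi) <= 0:
                alpha[i] += 1  # misclassified: increase this example's weight
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return alpha

def predict(X, y, alpha, kernel, x):
    """Classify x as +1 or -1."""
    return 1 if decision_value(X, y, alpha, kernel, x) > 0 else -1

# Toy data 1: linearly separable points, handled with the linear kernel.
X_lin = [(1, 1), (2, 2), (-1, -1), (-2, -1)]
y_lin = [1, 1, -1, -1]
alpha_lin = train_kernel_perceptron(X_lin, y_lin, linear_kernel)

# Toy data 2: XOR labels, not linearly separable, handled with the RBF kernel.
X_xor = [(0, 0), (1, 1), (0, 1), (1, 0)]
y_xor = [-1, -1, 1, 1]
alpha_xor = train_kernel_perceptron(X_xor, y_xor, rbf_kernel)

preds_lin = [predict(X_lin, y_lin, alpha_lin, linear_kernel, x) for x in X_lin]
preds_xor = [predict(X_xor, y_xor, alpha_xor, rbf_kernel, x) for x in X_xor]
print(preds_lin, preds_xor)
```

The learning routine is identical in both runs; only the kernel argument changes. In a bioinformatics setting the same idea would apply with a kernel defined on sequences or expression profiles in place of the vector kernels used here.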
In the first part of this tutorial we will introduce the basics of the method, gradually discussing its main components and features, and giving simple examples and pointers to the literature.
In the second part we will review the main applications of the method to bioinformatics problems: from remote protein homology detection to gene function classification, from the use of promoter information to cancer type classification.
Dr. Nello Cristianini is a senior scientist at BIOwulf Genomics. He has been working in the field of Support Vector and Kernel Machines for many years, co-authoring the first textbook on this topic and editing two journal special issues. He is an action editor of the Journal of Machine Learning Research, and a member of the steering board of kernel-machines.org.
Updated: February 20, 2002