A call for papers in

Literature Data Mining for Biology

A special session within the
Pacific Symposium on Biocomputing 2002
January 3-7, 2002
Kauai Marriott Resort and Beach Club

A large part of the information required for biology research can only be found in free-text form, as in MEDLINE abstracts, or in comment fields of relevant reports, as in GenBank feature table annotations. This information is important for many types of analysis, such as classification of proteins into functional groups, discovery of new functional relationships, maintenance of information on materials and methods, increased precision and relevance of hits returned by BLAST, extraction of protein interaction information, and so on. However, information in free-text form or in comment fields is very difficult for automated systems to use. In addition, the extracted information may need further enrichment, for example, through the inclusion of quantitative information about the interaction.

This session will investigate how natural language processing and data mining techniques can provide and structure information relevant to biological applications. The session solicits papers on natural language processing techniques and their application to the extraction of biological information from free text, including literature abstracts (e.g., MEDLINE), database annotations (e.g., GenBank or PIR), and other relevant biology sources. It will emphasize the combination of natural language techniques with other biological information sources, such as database and sequence searches, to facilitate collection and organization of information about particular genes, proteins, or pathways. In particular, we are interested in:

Session co-chairs

