Call For Papers

Translating Biology: Text Mining Tools That Work

A Pacific Symposium on Biocomputing Session

January 4-8, 2008

The Big Island, Hawai'i

Motivation for this session

Biomedical science is now an information-intensive field of study, with high-throughput experimental techniques generating large amounts of data, and bioinformatics providing tools for managing and making sense of that data. However, the information generated and used in biomedical science must be accessible both to computers and to people. This requires constant translation between human-readable forms, such as text and figures, and computer-readable forms, such as biological databases and ontologies. In a recent PLoS Computational Biology editorial, Philip Bourne posed the following question: Will a biological database be different from a biological journal? If we had text mining tools that worked, then the translation from text to database (and back) would blur these lines. Such tools would enable the seamless integration of semantic information extracted from text with databases and analytical tools, as just one of many sources of information in addressing complex biological problems.

Submission requirements

From the many publications in the area, we know that performance has reached reasonable levels on a number of basic text mining tasks, such as indexing and the identification of biomedical entities. We now need to ask a new set of questions: Do these tools work? Can they be adapted to new applications? Are they cost-effective in real applications? Who uses these tools, and how? Can these tools be maintained over time? The answers to these questions are critical to understanding the apparent gap between the number of publications on biomedical text mining and the number of deployed text mining applications. They are also essential to providing the bioinformatics community with the text mining tools that it is asking for. We categorize these questions into four attributes: utility, usability, portability, and robustness.

The proposed session will focus on papers that explore these issues, including questions such as:

What is the actual utility of text mining in the workflows of the various communities of potential users, such as model organism database curators, bedside clinicians, biologists using high-throughput experimental assays, and hospital billing departments?
How usable are biomedical text mining applications? How does the application fit into the workflow of a complex bioinformatics pipeline? What kind of training does a bioscientist require to be able to use an application?
Is it possible to build portable text mining systems? Can systems be adapted to specific domains and specific tasks without the assistance of an experienced language processing specialist?
How robust and reliable are biomedical text mining applications? What are the best ways to assess robustness and reliability? Are the standard evaluation paradigms of the natural language processing world—intrinsic evaluation against a gold standard, post-hoc judging of outputs by trained judges, extrinsic evaluation in the context of some other task—the best evaluation paradigms for biomedical text mining, or even sufficient evaluation paradigms?

Session chairs

Lynette Hirschman
The MITRE Corporation
Kevin Bretonnel Cohen (Contact person)
University of Colorado School of Medicine
kevin.cohen@gmail.com
Philip Bourne
University of California San Diego
Hong Yu
University of Wisconsin at Milwaukee

Submission information

The core of the conference consists of rigorously peer-reviewed full-length papers reporting on original work. Accepted papers will be published in hard-bound archival proceedings, and the best of these will be presented orally to the entire conference. Researchers wishing to present their research without official publication are encouraged to submit a one-page abstract by noon, November 9, 2007, to present their work in the poster sessions.

Important dates

Paper submissions due: July 21, 2007 midnight East Coast time (for this session only)
Notification of paper acceptance: September 5, 2007
Final paper deadline: September 24, 2007 midnight PT
Abstract deadline: November 9, 2007
Meeting: January 4-8, 2008

Paper format

All papers must be submitted to psb-submit @ helix.stanford.edu and kevin.cohen @ gmail.com in electronic format with PSB in the subject line. The file formats we accept are PostScript (*.ps) and Adobe Acrobat (*.pdf). Attached files should be named with the last name of the first author (e.g. altman.ps or altman.pdf). Hardcopy submissions and unprocessed TeX or LaTeX files will be rejected without review.

Each paper must be accompanied by a cover letter. The cover letter must state the following:

The email address of the corresponding author.
The specific PSB session that should review the paper or abstract.
The submitted paper contains original, unpublished results, and is not currently under consideration elsewhere.
All co-authors concur with the contents of the paper.

Submitted papers are limited to twelve (12) pages in our publication format. Please format your paper according to the instructions found at http://psb.stanford.edu/psb-online/psb-submit/. If figures cannot be easily resized and placed precisely in the text, then it should be clear that, with appropriate modifications, the total manuscript length would be within the page limit.

Color pictures can be printed at the expense of the authors. The fee is $500 per page of color pictures, payable at the time of camera-ready submission.

Contact Russ Altman (psb-submit @ helix.stanford.edu) for additional information about paper submission requirements.