Call for Papers and Posters

Session on Joint Learning from Multiple Types of Genomic Data

at the Pacific Symposium on Biocomputing 2005

 

Recent technological advances enable us to collect many different types of data at a genome-wide scale, including DNA sequences, gene and protein expression measurements, protein-protein interactions, protein structural information, and protein-DNA binding data. These data provide us with a means to elucidate the large-scale modular organization of the cell, and much recent work has been devoted to analyzing them for this purpose. However, most of this work analyzes a single type of data at a time, using other types of data only for validation.

Results and findings jointly learned from more than one type of data are likely to lead to new insights that might not be as readily available from analyzing one type of data in isolation. For instance, experimental genomic datasets often contain errors arising from imperfections in the applied technology. Thus, some of the findings of methods that analyze a single type of data may be erroneous. If we assume that technological errors across different genomic datasets are largely independent, then the probability of error in results that are supported by two different types of data is dramatically reduced.
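The independence argument above can be made concrete with a little arithmetic. The following is a minimal sketch, not a model of any real dataset: the function name and the 10% error rates are illustrative assumptions, and it treats a finding as erroneous only when every supporting data type is wrong at once.

```python
def joint_error_probability(error_rates):
    """Probability that all data sources err together, assuming their
    errors are independent (the assumption stated in the text)."""
    probability = 1.0
    for rate in error_rates:
        probability *= rate
    return probability


# Hypothetical per-dataset error rates, chosen only for illustration.
single_source = joint_error_probability([0.10])
two_sources = joint_error_probability([0.10, 0.10])

print(f"one data type:  {single_source:.3f}")
print(f"two data types: {two_sources:.3f}")
```

If errors are correlated across technologies, the product underestimates the joint error, so the reduction is only as large as the independence assumption is accurate.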

This session will focus on novel methods that use more than one type of data in their analysis and do so jointly. As an example, rather than obtaining clusters of genes with similar expression profiles and then trying to identify common DNA-binding sequence motifs in promoter regions of the genes in each cluster, methods are expected to combine the gene expression levels and sequences of promoter regions in a single framework or algorithm. Since there are many largely equivalent ways to cluster genes according to their expression profiles, the advantage of a combined approach is that it may refine the clustering so that it is also supported by the presence of sequence motifs. Such refinements may increase the signal-to-noise ratio and be critical to achieving meaningful results.
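As a rough illustration of why combining the two data types can refine a clustering, the sketch below scores gene pairs with a weighted sum of an expression dissimilarity and a promoter-motif dissimilarity. This is a deliberately simple stand-in, not one of the joint methods the session is soliciting; the function names, the weight, the motif labels, and the toy profiles are all assumptions made for the example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def jaccard_distance(motifs_a, motifs_b):
    """1 - |intersection| / |union| of two sets of promoter motifs."""
    union = motifs_a | motifs_b
    if not union:
        return 0.0  # two genes with no motifs are treated as identical here
    return 1.0 - len(motifs_a & motifs_b) / len(union)


def combined_distance(expr_a, expr_b, motifs_a, motifs_b, weight=0.5):
    """Weighted sum of expression and motif dissimilarities, both in [0, 1]."""
    expression_term = (1.0 - pearson(expr_a, expr_b)) / 2.0
    motif_term = jaccard_distance(motifs_a, motifs_b)
    return weight * expression_term + (1.0 - weight) * motif_term


# Toy data: genes B and C are equally well correlated with gene A,
# but only B shares A's (hypothetical) promoter motifs.
expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [1.5, 2.5, 3.5, 4.5]}
motifs = {"A": {"M1", "M2"}, "B": {"M1", "M2"}, "C": {"M3"}}

d_ab = combined_distance(expr["A"], expr["B"], motifs["A"], motifs["B"])
d_ac = combined_distance(expr["A"], expr["C"], motifs["A"], motifs["C"])
print(d_ab, d_ac)
```

Expression alone cannot separate B from C in this toy case, whereas the combined score places B closer to A. A truly joint method would learn the clusters and the motifs together rather than fixing a weight in advance.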

As another example, the recent genome-wide cis-regulatory protein-DNA binding data of Lee et al. provide experimental results regarding the targets of 106 transcription factors in yeast. If these data indicate that two genes are targets of the same transcription factor, the genes may interact or participate in the same biological process, but it might be difficult to conclude this with high probability based on the binding evidence alone. Combining additional data types might increase our confidence in such predicted interactions. For example, if a gene expression dataset indicated that the two genes have similar expression profiles, or if protein-protein interaction data indicated that their protein products physically interact, we might be more confident in our claim. This session aims to present research that proposes novel methods for jointly learning from such multiple types of data.
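One simple way to formalize this growth in confidence is to multiply the prior odds of a functional relationship by a likelihood ratio for each supporting data type. The sketch below assumes the evidence sources are conditionally independent; the function name, the prior, and the likelihood ratios are hypothetical values chosen for illustration and are not estimates from any dataset.

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with independent pieces of evidence.

    Each likelihood ratio is P(evidence | related) / P(evidence | unrelated).
    Conditional independence of the evidence sources is an assumption.
    """
    odds = prior / (1.0 - prior)
    for ratio in likelihood_ratios:
        odds *= ratio
    return odds / (1.0 + odds)


prior = 0.05              # hypothetical prior that two genes are related
lr_binding = 4.0          # hypothetical: shared transcription factor binding
lr_expression = 3.0       # hypothetical: similar expression profiles
lr_interaction = 5.0      # hypothetical: physical protein-protein interaction

binding_only = posterior_probability(prior, [lr_binding])
all_evidence = posterior_probability(
    prior, [lr_binding, lr_expression, lr_interaction]
)

print(f"binding evidence alone:  {binding_only:.3f}")
print(f"all three data types:    {all_evidence:.3f}")
```

With these illustrative numbers, binding evidence alone leaves the relationship unlikely, while agreement across all three data types makes it the more probable hypothesis.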

Possible Topics:

  • Cis-regulatory motif identification
  • Gene expression analysis
  • Comparative genomics
  • Protein-protein interactions and protein complexes
  • Genome-wide protein-DNA binding locations
  • Combining text mining and genomic data
  • Gene regulatory networks and/or pathway dynamics

Papers addressing any of the topics above (or other related topics) using multiple types of data are welcome. The session is especially interested in new methods for analyzing multiple types of experimental data and will give strong preference to papers that combine more than one of the following types of data: gene expression profiles, protein expression profiles, protein-protein interaction data, protein-DNA binding location data, protein 3D structural information, genomic sequence data (perhaps from multiple organisms), and pathway or functional annotation databases. The performance of the methods should ideally be compared to the performance that can be obtained by learning from one data source at a time.

Individuals who cannot submit to this session but are willing to help referee submissions are kindly requested to contact either of the session co-chairs.

General Information on PSB 2005:

The Pacific Symposium on Biocomputing (PSB 2005) is an international, multidisciplinary conference for the presentation and discussion of current research in the theory and application of computational methods in problems of biological significance. PSB 2005 will be held January 4-8, 2005, at the Fairmont Orchid on the Big Island of Hawaii. Tutorials will be offered prior to the start of the conference.

PSB has been designed to be responsive to the need for critical mass in sub-disciplines within biocomputing. For that reason, it is the only meeting whose sessions are defined dynamically each year in response to specific proposals. PSB sessions are targeted to provide a forum for publication and discussion of research in biocomputing's "hot topics." In this way, PSB provides an early forum for serious examination of emerging methods and approaches in this rapidly changing field. More information on the conference can be obtained from the conference web page: http://psb.stanford.edu/.

General Information on Papers, Abstracts, and Demonstrations:

The scientific core of the conference consists of rigorously peer-reviewed full-length papers reporting on original work. Accepted papers will be published in a hard-bound archival proceedings volume (which is fully indexed in Medline), and the best of these will be presented orally to the entire conference. Researchers wishing to present their research without official publication are encouraged to submit a one-page abstract and present their work in discussion, poster, and demonstration sessions. Workstations and Internet connections will be available for demonstrations; if you require demonstration facilities, please submit a detailed request along with your paper or abstract.

Paper Formatting and Submission:

All papers must be submitted to russ.altman@stanford.edu in electronic format. The only acceptable file formats are Adobe Acrobat (*.pdf) and Microsoft Word (*.doc). Attached files should be named with the last name of the first author (e.g., altman.pdf or altman.doc). Hardcopy submissions and unprocessed TeX or LaTeX files will be rejected without review.

Each paper must be accompanied by a cover letter. The cover letter must state the following:

  • The email address of the corresponding author
  • The specific PSB session that should review the paper or abstract
  • That the submitted paper contains original, unpublished results and is not currently under consideration elsewhere
  • That all co-authors concur with the contents of the paper

Submitted papers are limited to twelve (12) pages in the official PSB publication format. Please format your paper according to the instructions at http://psb.stanford.edu/psb-online/psb-submit/. If figures cannot be easily resized and placed precisely in the text, it should be clear that, with appropriate modifications, the total manuscript length would be within the page limit.

Important Dates:

  • Paper submission deadline: July 19, 2004
  • Notification of paper acceptance: September 13, 2004
  • Final paper deadline: September 27, 2004
  • Poster abstract deadline: November 1, 2004
  • Meeting: January 4-8, 2005

Session Co-chairs: