Identification of Aberrant Pathway and Network Activity from High-Throughput Data

Pacific Symposium on Biocomputing 2012
January 3-7, 2012
The Big Island of Hawaii

Motivation

Numerous large-scale projects are underway to perform GWAS, proteomic, and metabolomic studies; to collect genomic and epigenetic data from tens of thousands of cancer genomes (The Cancer Genome Atlas, International Cancer Genome Consortium); and to comprehensively characterize functional elements in the human genome (ENCODE). These projects are generating very large data sets that characterize genome-wide RNA transcript abundance, DNA methylation, copy number variation, inherited and somatic DNA sequence variation, structural rearrangements, transcription factor binding sites, histone marks, chromatin accessibility, RNA binding, and cellular abundance of peptides and metabolites.

Interpreting these data urgently requires new analysis methods, algorithms, and visualization tools, which presents a significant challenge to the computational biology and bioinformatics community.

In the area of cancer genomics, recent work from The Cancer Genome Atlas (TCGA) project and the Vogelstein-Kinzler-Velculescu group at Johns Hopkins has demonstrated conclusively that cancer etiology is driven not by a single gene mutation or expression change, but by coordinated changes in multiple signaling pathways. These pathway changes involve different genes in different individuals, which is why gene-focused analysis fails to identify the mutations or expression changes driving cancer development. As demonstrated initially by Lander's group, there is also evidence that metabolic pathways, rather than individual genes, play the critical role in metabolic diseases, a finding that led to the development of their gene set enrichment analysis approach.

Many complex databases are being developed and maintained to house genetic, epigenetic, genomic, and functional genomic data. Centralized resources such as the NCBI are developing databases to integrate reads from next generation sequencing experiments, tumor-derived somatic DNA sequence variation, and SNPs/haplotypes significantly associated with disease phenotypes in GWAS. Functional genomic data and methylation array data are being captured in the Gene Expression Omnibus (GEO) and ArrayExpress data repositories. TCGA combines all these types of data with detailed information about clinical phenotypes. This vast amount of open-access data now allows data analysts and informaticians to develop tools and perform initial demonstrations of their validity independently of new bench experiments, a unique opportunity for building tools suited to analyzing data arising from complex biology.

Following up on our well-received workshop on this topic at PSB 2011, we propose to organize a session that focuses on how pathway and network-based analysis and data integration tools can address the complexity of genome-scale, high-dimensional data sets.

Session Topics

— Algorithms to predict the impact of mutations using integrative approaches.
— Top-down and bottom-up approaches to infer altered pathway activities.
— Predicting clinical outcomes such as survival and drug response using combinations of high-throughput data including but not limited to mutations, copy number, methylation, expression, and external information such as from literature or curated repositories (e.g. COSMIC).
— Identifying novel gene-gene and gene-phenotype interactions from cancer genomics data.
— Methods that combine GWAS and functional genomics datasets.
— Novel network comparison and network alignment methods to identify significant tumor state changes.
— Algorithms to identify significantly altered subnetworks from a list of altered genes or a set of genomic scores.

Session Co-Chairs

Rachel Karchin
Johns Hopkins University
karchin at jhu dot edu

Michael Ochs
Johns Hopkins University
mfo at jhu dot edu

Josh Stuart
University of California, Santa Cruz
jstuart at soe dot ucsc dot edu

Joel Bader
Johns Hopkins University
joel dot bader at jhu dot edu

Session Guest Speaker

Trey Ideker, Ph.D.
Trey Ideker, Ph.D., is Professor of Medicine and Bioengineering at the University of California at San Diego. He serves as Division Chief of Medical Genetics and Director of the National Resource for Network Biology, and is also Adjunct Professor of Computer Science and Member of the Moores UCSD Cancer Center. Ideker received Bachelor's and Master's degrees from MIT in Electrical Engineering and Computer Science and his Ph.D. from the University of Washington in Molecular Biology under the supervision of Dr. Leroy Hood. He is a pioneer in assembling genome-scale measurements to construct network models of cellular processes and disease. His recent research activities include assembly of networks governing the response to DNA damage; development of the Cytoscape and NetworkBLAST software packages for biological network visualization and cross-species network comparison; and methods for identifying network-based biomarkers in development and disease. Ideker serves on the Editorial Boards for Bioinformatics and PLoS Computational Biology, is on the Scientific Advisory Boards of the Sanford-Burnham Medical Research Institute and the Institute for Systems Biology, and is a regular consultant for companies such as Monsanto and Mendel Biotechnology. He was named one of the Top 10 Innovators of 2006 by Technology Review magazine and was the recipient of the 2009 Overton Prize from the International Society for Computational Biology. His work has been featured in news outlets such as The Scientist, the San Diego Union Tribune, and Forbes magazine.

General Information on Papers and Presentations

The scientific core of the conference consists of rigorously peer-reviewed full-length papers reporting on original work. Accepted papers will be published in an archival proceedings volume (fully indexed in PubMed), and a number of the papers will be selected for presentation during the conference. Researchers wishing to present their research without official publication are encouraged to submit a one-page abstract, and present their work in a poster session.

Submission Information

Please note that the submitted papers are reviewed and accepted on a competitive basis. At least three reviewers will be assigned to each submitted manuscript.

Important Dates

— Paper submissions due: July 11, 2011
— Notification of paper acceptance: September 9, 2011
— Camera-ready final paper deadline: September 23, 2011 at 11:59pm PT
— Abstract deadline for non-reviewed posters: November 30, 2011 at noon ET

Paper Format

Please see the PSB paper format template and instructions at http://psb.stanford.edu/psb-online/psb-submit/.

The file formats we accept are PostScript (*.ps) and Adobe Acrobat (*.pdf). Attached files should be named with the last name of the first author (e.g. altman.ps or altman.pdf). Hardcopy submissions or unprocessed TeX or LaTeX files will be rejected without review.

Each paper must be accompanied by a cover letter. The cover letter must state the following:

— The email address of the corresponding author.
— The specific PSB session that should review the paper or abstract.
— That the submitted paper contains original, unpublished results and is not currently under consideration elsewhere.
— That all co-authors concur with the contents of the paper.

Submitted papers are limited to twelve (12) pages in our publication format. Please format your paper according to instructions found at http://psb.stanford.edu/psb-online/psb-submit/. If figures cannot be easily resized and placed precisely in the text, then it should be clear that with appropriate modifications, the total manuscript length would be within the page limit.

Contact Russ Altman for additional information about paper submission requirements.