Pattern Recognition in Biomedical Data: Challenges in Putting Big Data to Work


Analyses of big biomedical datasets must not only account for the heterogeneity, multidimensionality, noisiness and incompleteness of the data itself, but must also consider the substantial computational resources required for data processing. Indeed, the data-intensive nature of problems in biomedical informatics warrants the development and use of novel, well-designed algorithms as well as massive computer infrastructure and advanced software tools, including those deployed in the cloud. In this session, we will address issues related to the optimization of tool development for large-scale datasets, such as compute time, storage, and the need for parallelization, with a particular focus on computational pattern recognition methods. We are especially interested in innovative approaches to identify and overcome challenges associated with utilizing various types of biomedical data, including but not limited to electronic health records, medical images, genetic sequences, and various 'omics data. Finally, our session will also focus on the difficulties arising from integrating diverse biomedical data (such as unprocessed textual data, multi-omics data, cross-species data, cross-institutional data, or summary-level statistics) in order to accurately identify patterns across biomedical datasets.

Session Topics

Topics within the scope of this session include:

Development and use of novel deep learning and machine learning approaches to facilitate pattern recognition across data sources such as
  • Electronic Health Records (EHRs)
  • Medical imaging data
  • Natural language text
  • and others
Computational analyses that address challenges associated with dataset imperfections such as
  • Sparse data
  • Noisy data
  • Multidimensional data
  • Integrated data of various types and from diverse sources
Tools and approaches to optimize computational resource requirements for large-scale, high-dimensional data analysis, such as
  • GPU
  • Hadoop
  • Cloud computing
  • Databases
Data mining approaches that promote the use of metadata in order to integrate data from various sources and predict disease outcomes.
Visualization of patterns in big biomedical data and novel approaches to enhance their reproducibility.
Other related topics.

Session Organizers

Dokyoon Kim

University of Pennsylvania

Shilpa Kobren

Harvard Medical School

Submission Information

Paper Submission Deadline: August 5, 2019 (THIS IS AN ABSOLUTE DEADLINE)
Notification of Acceptance: September 6, 2019
Poster/Abstract Submission Deadline: November 15, 2019
Conference Date: January 3 - 7, 2020

Papers must be submitted to the PSB paper management system. Please note that the submitted papers are reviewed and accepted on a competitive basis. At least three reviewers will be assigned to each submitted manuscript.

The accepted file formats are PostScript (*.ps) and Adobe Acrobat (*.pdf). Attached files should be named with the last name of the first author (e.g., altman.pdf). Hardcopy submissions, unprocessed TeX or LaTeX files, or electronic submissions not submitted through the paper management system will be rejected without review.

Each paper must be accompanied by a cover letter. The cover letter should be the first page of your paper submission. The cover letter must state the following:

  • The email address of the corresponding author
  • The specific PSB session that should review the paper or abstract
  • The submitted paper contains original, unpublished results, and is not currently under consideration elsewhere
  • All co-authors concur with the contents of the paper

Submitted papers are limited to twelve (12) pages (not including the cover letter) in our publication format. Please format your paper according to these instructions. If figures cannot be easily resized and placed precisely in the text, then it should be clear that with appropriate modifications, the total manuscript length would be within the page limit.