PSB 2017 Workshop: Open Data for Discovery Science
Philip R.O. Payne, PhD (1); Kun Huang, PhD (2); Nigam H. Shah, MBBS, PhD (3);
Jessica Tenenbaum, PhD (4)
(1) Washington University in St. Louis, Institute for Informatics
(2) The Ohio State University, Department of Biomedical Informatics
(3) Stanford University, Center for Biomedical Informatics Research
(4) Duke University, Department of Biostatistics and Bioinformatics
Introduction:
The modern healthcare and life sciences ecosystem is moving towards an
increasingly open and data-centric approach to discovery science. This evolving paradigm is predicated on
a complex set of information needs related to our collective ability to share,
discover, reuse, integrate, and analyze open biological, clinical, and
population-level data resources of varying composition, granularity, and
syntactic or semantic consistency.
Such an evolution is further impacted by a concomitant growth in the
size of data sets that can and should be employed for both hypothesis discovery
and testing. When such open data
can be accessed and employed for discovery purposes, a broad spectrum of
high-impact endpoints is made possible. These range from the identification
of de novo biomarker complexes that
can inform precision medicine, to the repositioning or repurposing of extant
agents for new and cost-effective therapies, to the assessment of population-level
influences on disease and wellness.
Of note, these types of uses of open data can be either primary, wherein
open data is the substantive basis for inquiry, or secondary, wherein open data
is used to augment or enrich project-specific or proprietary data that is not
open in and of itself.
Given these opportunities and the current state of
knowledge concerning the use of open data across and between types and scales
for the purposes of discovery science, this workshop will address:
1. The state-of-the-art in terms of tools and methods targeting the use of
open data for discovery science, including but not limited to syntactic and
semantic standards, platforms for data sharing and discovery, and computational
workflow orchestration technologies that enable the creation of data analytics
"pipelines"
2. Practical approaches for the automated and/or semi-automated harmonization,
integration, analysis, and presentation of "data products" to enable hypothesis
discovery or testing
3. Frameworks for the application of open data to support or enable hypothesis
generation and testing in projects spanning the basic, translational, clinical,
and population health research and practice domains (e.g., from molecules to
populations).
Workshop Rationale:
PSB 2017 is
specifically intended to provide a forum in which participants can present: "work in databases, algorithms, interfaces,
natural language processing, modeling and other computational methods, as
applied to biological problems, with emphasis on applications in data-rich
areas of molecular biology." In addition,
"A major goal of PSB is to
create productive interaction among the rather different research cultures of
computer science and biology."
As such, PSB 2017 provides an ideal venue for a vigorous and highly productive
exchange of knowledge and ideas surrounding the current and future directions
for the use of open data in order to support or enable discovery science,
an area which by its nature involves:
1. The creation, verification and validation of tools and methods that can
assist in the sharing, discovery, and analysis of open data in a primary or
secondary manner, including the development of databases, algorithms, and
modeling techniques therein
2. The conduct of discovery science in data-intensive experimental contexts
that leverage such open data resources
3. The interaction of multidisciplinary computational, biological, clinical,
and population health science teams to conduct research that serves to
translate such discovery into patient-level or broader intervention strategies
to improve human health and wellness.
Workshop Organizers:
Philip R.O. Payne, PhD, FACMI
Dr. Payne is
the Founding Director of the Institute for Informatics at Washington University
in St. Louis. Previously, he served
as Professor and Chair of the Department of Biomedical Informatics at The Ohio
State University (OSU), where he was also the inaugural Director of
Translational Data Analytics @ OSU, a campus-wide program to create a singular
presence in applied data analytics at one of the nation's largest land-grant
universities. Dr. Payne is an
internationally recognized leader in the field of clinical research informatics
(CRI) and translational bioinformatics (TBI). His research portfolio is actively
supported by a combination of NCATS, NLM, and NCI grants and contracts, as well
as a variety of awards from both non-profit and philanthropic organizations. Dr.
Payne received his Ph.D. with distinction in Biomedical Informatics from
Columbia University, where his research focused on the use of knowledge
engineering and human-computer interaction design principles in order to
improve the efficiency of multi-site clinical and translational research
programs. Dr. Payne is also the
co-founder of Signet Accel LLC, a healthcare information technology start-up
that delivers advanced data sharing and interoperability solutions to the
healthcare delivery, translational research, and bio-pharmaceutical
sectors. He is an elected fellow of
the American College of Medical Informatics (ACMI), and serves as a consultant
and advisor to a broad spectrum of academic, government, and private sector
informatics and data science initiatives at the international level.
Kun Huang, PhD
Dr. Kun
Huang is Professor in Biomedical Informatics, Computer Science and Engineering,
and Biostatistics at The Ohio State University (OSU). He is also the Division
Director for Bioinformatics and Computational Biology in the OSU Department of
Biomedical Informatics as well as Associate Dean for Genomic Informatics in the
OSU College of Medicine. His research program focuses on developing
bioinformatics tools for systems biology and translational research. He has
developed many methods for analyzing and integrating various types of
high-throughput biomedical data, including gene expression microarray,
next-generation sequencing (NGS), qRT-PCR, proteomics, and
microscopic imaging experiments. These methods have been successfully applied
to research projects on a range of diseases and conditions, including cancers,
fibrosis, cardiovascular diseases, wound healing, inflammatory bowel diseases,
and idiopathic pulmonary fibrosis (IPF). Recently he has been awarded multiple
grants for developing integrative genomics software and algorithms for disease
biomarker and therapeutic target discovery. Dr. Kun Huang received
his BS degree in Biological Sciences from Tsinghua University in 1996 and his
MS degrees in Physiology, Electrical Engineering, and Mathematics, all from the
University of Illinois at Urbana-Champaign (UIUC). He then received his PhD in
Electrical and Computer Engineering from UIUC in 2004 with a focus on computer
vision and machine learning.
Nigam Shah, MBBS, PhD, FACMI
Dr. Nigam Shah is Associate Professor of Medicine (Biomedical Informatics) at Stanford
University, Assistant Director of the Center for Biomedical Informatics
Research, and a core member of the Biomedical Informatics Graduate Program. Dr.
Shah's research focuses on combining machine learning and prior knowledge in
medical ontologies to enable use cases of the learning health system. Dr. Shah
received the AMIA New Investigator Award for 2013 and the Stanford Biosciences
Faculty Teaching Award for outstanding teaching in his graduate class on "Data
driven medicine" (Biomedin 215). Dr. Shah was elected
into the American College of Medical Informatics (ACMI) in 2015 and to the
American Society for Clinical Investigation (ASCI) in 2016. He holds an MBBS
from Baroda Medical College, India, and a PhD from Penn State University, and
completed postdoctoral training at Stanford University. More at: https://med.stanford.edu/profiles/nigam-shah.
Jessica Tenenbaum, PhD
Dr. Tenenbaum is Assistant Professor in the Division of
Translational Biomedical Informatics, Department of Biostatistics and
Bioinformatics at Duke University, and Associate Director for Bioinformatics
for the Duke Translational Medicine Institute. Her primary areas of research
are: 1) Infrastructure and standards to enable research collaboration and integrative
data analysis; 2) Informatics to enable precision medicine; and 3) Ethical,
legal, and social issues that arise in translational research, direct-to-consumer
genetic testing, and data sharing. At Duke, Dr. Tenenbaum
oversaw development of the MURDOCK Integrated Data Repository (MIDR) for
management and integration of clinical, demographic, omic,
and bio-specimen data for the MURDOCK Study (www.murdock-study.com) and related
ancillary studies. Nationally, Dr. Tenenbaum plays a
leadership role in the American Medical Informatics Association, serving as
Chair of the Genomics and Translational Bioinformatics Working Group and as an
elected member of the Board of Directors. She is an Associate Editor for the
Journal of Biomedical Informatics and serves on the advisory panel for Nature
Publishing Group's Scientific Data initiative. After earning her bachelor's degree in
biology from Harvard, Dr. Tenenbaum worked as a
program manager at Microsoft Corporation in Redmond, WA for six years before
pursuing a PhD in biomedical informatics at Stanford University.