Merging Heterogeneous Data to Enable Knowledge Discovery
1 Department of Medicine (Biomedical Informatics), Stanford University
2 Department of Pediatrics, University of Colorado Denver Anschutz Medical Campus
The digitalization of high-value information is
generating measurements on dynamic processes, interactions, and
systems that cross multiple orders of magnitude. Innovative results are
beginning to emerge from linking, integrating, and harmonizing data
and knowledge across previously independent sources. For this
workshop we invite submissions that highlight new results from linking
and integrating data and knowledge across heterogeneous sources (e.g.,
electronic medical records, geocoded data, genetic information, social media).
The "digitalization of everything" is generating
digital measurements on dynamic processes, interactions, and systems
that cross multiple orders of magnitude. Weber, Mandl, and Kohane
(2016) describe an extensive existing data ecosystem that they call the
"Tapestry of high-value information sources." Wearable multi-sensors
record continuous real-time measurements of personal biological
processes and physiological responses while digital homes and
sensors capture everyday environmental exposures. At the same time,
a similar explosion is occurring in digital knowledge through
electronic publication, large-scale ontology development and
knowledge grids within a learning healthcare system. Collectively,
these sources also capture knowledge on processes, interactions and
systems across physical, temporal, and systems scales never before
available. An aggressive effort by the open data community and
funding agencies seeks to ensure that these extensive digital assets
are liquid, transparent, and linkable.
We invite researchers from different fields to present high-impact
research in these areas, including (but not limited to):
- Innovations in linking and integrating resources (data
  fusion) at all levels of the biological, clinical, and
  environmental/exposure scales.
- Efforts to create novel data and knowledge assets that
  enable findings not possible with a single source.
- Demonstrations of how integrated/linked data are more
  robust to data quality anomalies that could prevent discovery
  (missingness, bias, noisiness).
- Following the PSB 2018 Workshop on Diversity and Disparity,
  examples where combined data sets enabled the study of
  populations that are not adequately represented in medical research,
  so that those populations can benefit from the research findings.
- Development and use of novel methods and tools for data
  fusion, knowledge extraction, and visualization of disparate data.
Interested researchers should submit a one-page abstract of their presentation to boussard AT Stanford.edu by August 1, 2018. References and one figure are optional and do not count toward the page limit. Invitations to present will be sent by August 15, 2018.