Best practices and lessons learned from reuse of 4 patient-derived metabolomics datasets in Alzheimer's disease

Jessica D. Tenenbaum1, Colette Blach2


1Department of Biostatistics & Bioinformatics, Duke University
2Duke Molecular Physiology Institute, Duke University
Email: jessie.tenenbaum@duke.edu, colette.blach@duke.edu

Pacific Symposium on Biocomputing 23:280-291(2018)

© 2018 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.


Abstract

The importance of open data has been increasingly recognized in recent years. Although the sharing and reuse of clinical data for translational research lags behind best practices in biological science, a number of patient-derived datasets exist and have been published, enabling translational research spanning multiple scales, from molecular to organ level and from patients to populations. In seeking to replicate metabolomic biomarker results in Alzheimer's disease, our team identified three independent cohorts in which to compare findings. Accessing the datasets associated with these cohorts, understanding their content and provenance, and comparing variables between studies was a valuable exercise in exploring the principles of open data in practice. It also helped inform steps taken to make the original datasets available for use by other researchers. In this paper, we describe best practices and lessons learned in attempting to identify, access, understand, and analyze these additional datasets to advance research reproducibility, as well as steps taken to facilitate sharing of our own data.

