Empowering Multi-Cohort Gene Expression Analysis to Increase Reproducibility

Winston A. Haynes¹,²,³, Francesco Vallania¹, Charles Liu¹,⁴, Erika Bongen¹, Aurelie Tomczak¹,³, Marta Andres-Terrè¹, Shane Lofgren¹, Andrew Tam¹, Cole A. Deisseroth¹,⁴, Matthew D. Li¹, Timothy E. Sweeney¹,³, and Purvesh Khatri¹,³

¹Stanford Institute for Immunity, Transplantation, and Infection, Stanford University
²Biomedical Informatics Training Program, Stanford University
³Stanford Center for Biomedical Informatics Research, Stanford University
⁴Stanford Institutes of Medicine Research Program, Stanford University
Email: pkhatri@stanford.edu

Pacific Symposium on Biocomputing 22:144-153 (2017)

© 2017 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.

Abstract

A major contributor to the scientific reproducibility crisis has been that results from homogeneous, single-center studies do not generalize to heterogeneous, real-world populations. Multi-cohort gene expression analysis has helped to increase reproducibility by aggregating data from diverse populations into a single analysis. To make multi-cohort analysis more feasible, we have assembled an analysis pipeline that implements rigorously studied meta-analysis best practices. We have also compiled and made publicly available, through a novel, interactive web application, the results of our own multi-cohort gene expression analysis of 103 diseases, spanning 615 studies and 36,915 samples. As a result, we have made both the process and the results of multi-cohort gene expression analysis more approachable for non-technical users.
