Cell-specific prediction and application of drug-induced gene expression

Rachel Hodos1,2,3, Ping Zhang4, Hao-Chih Lee1,2, Qiaonan Duan5,6,7, Zichen Wang5,6,7, Neil R. Clark5,6,7, Avi Ma'ayan5,6,7, Fei Wang4,8, Brian Kidd1,2,9, Jianying Hu4, David Sontag10, Joel Dudley1,2,9


1Institute for Next Generation Healthcare, Icahn School of Medicine at Mount Sinai
2Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai
3Courant Institute of Mathematical Sciences, New York University
4IBM T. J. Watson Research Center
5Department of Pharmacological Sciences
6BD2K-LINCS Data Coordination and Integration Center
7Mount Sinai Center for Bioinformatics, Icahn School of Medicine at Mount Sinai
8Healthcare Policy and Research, Weill Cornell Medical College, Cornell University
9Harris Center for Precision Wellness, Icahn School of Medicine at Mount Sinai
10Institute for Medical Engineering and Science, Massachusetts Institute of Technology

Pacific Symposium on Biocomputing 23:32-43(2018)

© 2018 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.


Abstract

Gene expression profiling of in vitro drug perturbations is useful for many biomedical discovery applications including drug repurposing and elucidation of drug mechanisms. However, limited data availability across cell types has hindered our capacity to leverage or explore the cell-specificity of these perturbations. While recent efforts have generated a large number of drug perturbation profiles across a variety of human cell types, many gaps remain in this combinatorial drug-cell space. Hence, we asked whether it is possible to fill these gaps by predicting cell-specific drug perturbation profiles using available expression data from related conditions, i.e., from other drugs and cell types. We developed a computational framework that first arranges existing profiles into a three-dimensional array (or tensor) indexed by drugs, genes, and cell types, and then uses either local (nearest-neighbors) or global (tensor completion) information to predict unmeasured profiles. We evaluate prediction accuracy using a variety of metrics, and find that the two methods have complementary performance, each superior in different regions of the drug-cell space. Predictions achieve correlations of 0.68 with true values, and accurately recover differentially expressed genes (AUC 0.81). Finally, we demonstrate that the predicted profiles add value for making downstream associations with drug targets and therapeutic classes.
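
To make the tensor arrangement and the local (nearest-neighbors) idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that unmeasured profiles are stored as NaN, that drug similarity is the mean Pearson correlation over shared cell types, and that the prediction is an unweighted average of the k most similar drugs' profiles in the target cell type. The function name `knn_predict` and the toy data are hypothetical; the global (tensor completion) variant is not shown.

```python
import numpy as np


def knn_predict(tensor, drug, cell, k=2):
    """Predict the unmeasured profile tensor[drug, :, cell].

    tensor has shape (n_drugs, n_genes, n_cells); an unmeasured
    (drug, cell) profile is assumed to be filled with NaN.
    """
    n_drugs = tensor.shape[0]
    # measured[d, c] is True when the full profile for (d, c) is present.
    measured = ~np.isnan(tensor).any(axis=1)

    scored = []
    for other in range(n_drugs):
        # A neighbor must be a different drug that was measured in the target cell.
        if other == drug or not measured[other, cell]:
            continue
        # Compare the two drugs only in cell types where both were measured.
        shared = np.where(measured[drug] & measured[other])[0]
        if shared.size == 0:
            continue
        corr = np.mean(
            [np.corrcoef(tensor[drug, :, c], tensor[other, :, c])[0, 1] for c in shared]
        )
        if np.isnan(corr):  # e.g. a constant profile has undefined correlation
            continue
        scored.append((corr, other))

    if not scored:
        raise ValueError("no neighboring drug shares a measured cell type")

    # Keep the k most similar drugs and average their profiles in the target cell.
    scored.sort(reverse=True)
    neighbors = [other for _, other in scored[:k]]
    return tensor[neighbors, :, cell].mean(axis=0)


# Toy tensor: 3 drugs x 4 genes x 2 cell types; drug 0 is unmeasured in cell 1.
toy = np.full((3, 4, 2), np.nan)
toy[0, :, 0] = [1, 2, 3, 4]
toy[1, :, 0] = [1, 2, 3, 5]   # similar to drug 0 in cell 0
toy[1, :, 1] = [2, 4, 6, 8]
toy[2, :, 0] = [4, 3, 2, 1]   # anti-correlated with drug 0 in cell 0
toy[2, :, 1] = [8, 6, 4, 2]

pred = knn_predict(toy, drug=0, cell=1, k=1)
print(pred)
```

With k=1 the sketch selects drug 1 as the nearest neighbor of drug 0 and returns drug 1's profile in cell type 1. Larger k averages over more neighbors, trading specificity for robustness.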

