Discriminative bag-of-cells for imaging-genomics

Benjamin Chidester¹, Minh N. Do², Jian Ma¹

¹ Computational Biology, School of Computer Science, Carnegie Mellon University
² Electrical and Computer Engineering, University of Illinois at Urbana-Champaign
Email: bchidest@cs.cmu.edu, minhdo@illinois.edu, jianma@cs.cmu.edu

Pacific Symposium on Biocomputing 23:319-330(2018)

© 2018 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.


Abstract

Connecting genotypes to image phenotypes is crucial for a comprehensive understanding of cancer. To learn such connections, new machine learning approaches must be developed to better integrate imaging and genomic data. Here we propose a novel approach called Discriminative Bag-of-Cells (DBC) for predicting genomic markers from imaging features. DBC addresses the challenge of summarizing histopathological images by representing cells with learned discriminative types, or codewords. We also developed a reliable and efficient patch-based nuclear segmentation scheme using convolutional neural networks, from which nuclear and cellular features are extracted. Applying DBC to TCGA breast cancer samples to predict basal subtype status yielded a class-balanced accuracy of 70% on a separate test partition of 213 patients. As data sets of imaging and genomic data become increasingly available, we believe DBC will be a useful approach for screening histopathological images for genomic markers. Source code for nuclear segmentation and DBC is available at: https://github.com/bchidest/DBC.
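
To make two terms in the abstract concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes the codewords are already given and assigns each cell to its nearest codeword, which is a plain bag-of-words encoding; the discriminative learning of codewords that defines DBC is not shown. The function names `bag_of_cells` and `balanced_accuracy` and the toy data are hypothetical, chosen only for illustration. Class-balanced accuracy is computed here as the mean of per-class recall, the usual definition, which may differ in detail from the paper's evaluation code.

```python
import numpy as np


def bag_of_cells(cell_features, codewords):
    """Summarize one image as a normalized histogram over codewords.

    cell_features: (n_cells, n_features) array, one row per segmented cell.
    codewords:     (n_codewords, n_features) array (assumed given here).
    """
    # Squared Euclidean distance from every cell to every codeword.
    diffs = cell_features[:, None, :] - codewords[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    # Each cell is represented by the index of its nearest codeword.
    assignments = dists.argmin(axis=1)
    counts = np.bincount(assignments, minlength=len(codewords))
    return counts / counts.sum()


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [(y_pred[y_true == c] == c).mean() for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Toy data (hypothetical): two codewords, four cells with two features each.
codewords = np.array([[0.0, 0.0], [10.0, 10.0]])
cells = np.array([[0.1, -0.2], [9.5, 10.3], [10.2, 9.9], [0.3, 0.1]])
hist = bag_of_cells(cells, codewords)

# Toy labels (hypothetical): the minority class 1 is half right, class 0 fully right.
score = balanced_accuracy([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0])
```

The histogram gives a fixed-length image summary regardless of how many cells were segmented, which is what allows a standard classifier to be trained on images of varying size. Balancing by class matters when one class, such as the basal subtype, is much rarer than the other, since plain accuracy would reward always predicting the majority class.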

