Chemical reaction vector embeddings: towards predicting drug metabolism in the human gut microbiome

Emily K. Mallory1,†, Ambika Acharya2,†, Stefano E. Rensi3, Peter J. Turnbaugh4, Roselie A. Bright5, Russ B. Altman6


1Biomedical Informatics Training Program, Stanford University
2Computer Science Department, Stanford University
3Department of Bioengineering, Stanford University
4Department of Microbiology & Immunology, University of California, San Francisco
5Office of Health Informatics, Office of the Chief Scientist, Office of the Commissioner, Food and Drug Administration (FDA)
6Departments of Bioengineering, Genetics, Medicine, and Biomedical Data Science, Stanford University
†Authors contributed equally to this work
Email: rbaltman@stanford.edu

Pacific Symposium on Biocomputing 23:56-67(2018)

© 2018 World Scientific
Open Access chapter published by World Scientific Publishing Company and distributed under the terms of the Creative Commons Attribution (CC BY) 4.0 License.


Abstract

Bacteria in the human gut can activate, inactivate, and reactivate drugs, with both intended and unintended effects. For example, the drug digoxin is reduced to the inactive metabolite dihydrodigoxin by the gut Actinobacterium E. lenta, and patients colonized with high levels of drug-metabolizing strains may have limited response to the drug. Understanding the complete space of drugs that are metabolized by the human gut microbiome is critical for predicting bacteria-drug relationships and their effects on individual patient response. Nearly a century of experimental research has identified and validated more than 50 drugs that are metabolized by bacterial enzymes. However, few computational tools exist for screening drugs for potential metabolism by the gut microbiome. We developed a pipeline for comparing and characterizing chemical transformations using continuous vector representations of molecular structure learned through unsupervised representation learning. We applied this pipeline to chemical reaction data from MetaCyc to characterize the utility of vector representations for chemical reaction transformations. After clustering molecular and reaction vectors, we performed enrichment analyses and queries to characterize the space. We detected enriched enzyme names, Gene Ontology terms, and Enzyme Commission (EC) classes within reaction clusters. In addition, we queried reactions against drug-metabolite transformations known to occur in the human gut microbiome. The top results for these known drug transformations contained substructure modifications similar to those of the original drug pair. This work enables high-throughput screening of drugs and their resulting metabolites against chemical reactions common to gut bacteria.
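
The query step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how reaction vectors are built or compared, so this example assumes that a transformation is represented as the product embedding minus the substrate embedding and that candidate reactions are ranked by cosine similarity. The function names, reaction labels, and three-dimensional toy embeddings are all hypothetical and are not taken from the paper or from MetaCyc.

```python
import math


def reaction_vector(substrate_vec, product_vec):
    """Assumed representation: product embedding minus substrate embedding."""
    if len(substrate_vec) != len(product_vec):
        raise ValueError("embeddings must have the same dimension")
    return [p - s for s, p in zip(substrate_vec, product_vec)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_reactions(query_vec, reaction_vecs, top_k=3):
    """Return the top_k (name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in reaction_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings standing in for learned molecular vectors.
toy_reactions = {
    "rxn_A": reaction_vector([1.0, 0.0, 0.0], [1.0, 1.0, 0.0]),
    "rxn_B": reaction_vector([0.0, 1.0, 0.0], [0.0, 0.0, 0.0]),
    "rxn_C": reaction_vector([0.0, 0.0, 1.0], [1.0, 0.0, 1.0]),
}

# Hypothetical drug-to-metabolite pair used as the query transformation.
drug_query = reaction_vector([0.5, 0.2, 0.1], [0.5, 0.9, 0.1])

ranked = rank_reactions(drug_query, toy_reactions)
for name, score in ranked:
    print(name, round(score, 3))
```

In this toy setup the query shifts the embedding along the same axis as rxn_A, so rxn_A ranks first and rxn_B, which shifts it in the opposite direction, ranks last. With learned embeddings, the same ranking would be run over a full set of reaction vectors.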

