Ab Initio Prediction of Transcription Factor Binding Sites

Liu LA, Bader JS

Department of Biomedical Engineering and High-Throughput Biology Center, Johns Hopkins University, Baltimore, MD 21218, USA
E-mail: joel.bader@jhu.edu



Pac Symp Biocomput. 2007:484-495.


Abstract

Transcription factors are DNA-binding proteins that control gene transcription by binding specific short DNA sequences. Experiments that identify transcription factor binding sites are often laborious and expensive, and the binding sites of many transcription factors remain unknown. We present a computational scheme to predict the binding sites directly from transcription factor sequence using all-atom molecular simulations. This method is a computational counterpart to recent high-throughput experimental technologies that identify transcription factor binding sites (ChIP-chip and protein-dsDNA binding microarrays). The only requirement of our method is an accurate 3D structural model of a transcription factor–DNA complex. We apply free energy calculations by thermodynamic integration to compute the change in binding energy of the complex due to a single base pair mutation. By calculating the binding free energy differences for all possible single mutations, we construct a position weight matrix for the predicted binding sites that can be directly compared with experimental data. As water-bridged hydrogen bonds between the transcription factor and DNA often contribute to the binding specificity, we include explicit solvent in our simulations. We present successful predictions for the yeast MATα2 homeodomain and GCN4 bZIP proteins. Water-bridged hydrogen bonds are found to be more prevalent than direct protein-DNA hydrogen bonds at the binding interfaces, indicating why empirical potentials with implicit water may be less successful in predicting binding. Our methodology can be applied to a variety of DNA-binding proteins.
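
As an illustration of the step from binding free energy differences to a position weight matrix, the following minimal sketch assumes a Boltzmann weighting of per-base free energy changes at each position. The abstract does not state the conversion actually used, so the weighting, the value of kT, the function name pwm_from_ddg, and the example free energies are all assumptions made for illustration only.

```python
import math

# Assumed thermal energy kT in kcal/mol near 300 K (not given in the abstract).
KT_KCAL_PER_MOL = 0.596

BASES = "ACGT"


def pwm_from_ddg(ddg_rows, kt=KT_KCAL_PER_MOL):
    """Turn per-position binding free energy changes into base probabilities.

    ddg_rows is a list with one dict per binding-site position. Each dict maps
    a base to its binding free energy change (kcal/mol) relative to the
    reference base, whose value is 0. A Boltzmann weighting is assumed here.
    """
    pwm = []
    for row in ddg_rows:
        # Lower free energy change means tighter binding and a larger weight.
        weights = {b: math.exp(-row[b] / kt) for b in BASES}
        total = sum(weights.values())
        pwm.append({b: weights[b] / total for b in BASES})
    return pwm


# Hypothetical free energy changes for a two-position site (not from the paper).
example = [
    {"A": 0.0, "C": 1.2, "G": 0.8, "T": 2.0},
    {"A": 1.5, "C": 1.5, "G": 1.5, "T": 0.0},
]

for position, probs in enumerate(pwm_from_ddg(example), start=1):
    print(position, {b: round(p, 3) for b, p in probs.items()})
```

Each row of the result sums to one, so it can be compared column by column with an experimentally derived position weight matrix. Under this assumed weighting, a position where all four bases have the same free energy change yields a uniform row of 0.25.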

