Structure-Based Function Prediction of Uncharacterized Protein Using Binding Sites Comparison
Open Access
- 14 November 2013
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Computational Biology
- Vol. 9 (11), e1003341
- https://doi.org/10.1371/journal.pcbi.1003341
Abstract
A challenge in structural genomics is predicting the function of uncharacterized proteins. When proteins cannot be related to other proteins of known activity, identification of function based on sequence or structural homology is impossible; in such cases, it is useful to assess structurally conserved binding sites in connection with the protein's function. In this paper, we propose the function of a protein of unknown activity, the Tm1631 protein from Thermotoga maritima, by comparing its predicted binding site to a library containing thousands of candidate structures. The comparison revealed numerous similarities with nucleotide binding sites, including, specifically, a DNA-binding site of endonuclease IV. We constructed a model of Tm1631 with a DNA ligand from the newly found similar binding site using ProBiS, and validated this model by molecular dynamics. The interactions predicted by the Tm1631-DNA model corresponded to those known to be important in the endonuclease IV-DNA complex model, and the corresponding binding free energies, calculated from these models, were in close agreement. We thus propose that Tm1631 is a DNA-binding enzyme with endonuclease activity that recognizes DNA lesions in which at least two consecutive nucleotides are unpaired. Our approach is general and can be applied to any protein of unknown function. It might also help guide experimental determination of the function of uncharacterized proteins.
The functions of a substantial proportion of proteins are not known because these proteins are not related in sequence to any other known proteins. Binding sites are evolutionarily conserved across very distant protein families, and finding similar binding sites between known and unknown proteins can provide clues to the functions of the unknown proteins.
We chose one of the “unknown function” proteins and, using a novel strategy in which binding site comparison is used to construct a hypothetical protein-ligand complex that is subsequently validated by molecular dynamics, found that this protein most likely binds and repairs damaged DNA, similarly to known DNA-repair enzymes. Our methodology is general and enables one to determine the functions of other proteins currently labelled as “unknown function”. We envision that the methodology presented herein, binding site comparison enhanced by molecular dynamics, will stimulate function prediction for other uncharacterized proteins with structures in the Protein Data Bank and boost experimental functional studies of proteins of unknown function.