Detection of 3D atomic similarities and their use in the discrimination of small molecule protein-binding sites
Open Access
- 9 August 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (16), i105-i111
- https://doi.org/10.1093/bioinformatics/btn263
Abstract
Motivation: Current computational methods for the prediction of function from structure are restricted to the detection of similarities and the subsequent transfer of functional annotation. In a significant minority of cases, global sequence or structural (fold) similarities do not provide clues about protein function. In these cases, one alternative is to detect local binding site similarities. These may still reflect more distant evolutionary relationships, as well as the unique physico-chemical constraints necessary for binding similar ligands, thus helping to pinpoint the function. In the present work, we ask the following question: is it possible to discriminate, within a dataset of non-homologous proteins, those that bind similar ligands based on their binding site similarities?
Methods: We implement a graph-matching-based method for the detection of 3D atomic similarities, introducing some simplifications that allow us to extend its applicability to the analysis of large all-atom binding site models. This method, called IsoCleft, does not require atoms to be connected either in sequence or in space. We apply the method to a cognate-ligand-bound dataset of non-homologous proteins. To uncouple the question of predicting the location of the binding site from that of detecting binding site similarities, we define a family of binding site models with decreasing knowledge about the identity of the ligand-interacting atoms. Furthermore, we calculate the individual contributions of binding site size, chemical composition and geometry to prediction performance.
Results: We find that it is possible to discriminate between different ligand-binding sites. In other words, there is a certain uniqueness in the set of atoms that are in contact with specific ligand scaffolds. This uniqueness is restricted to the atoms in close proximity to the ligand, in which case size and chemical composition alone are sufficient to discriminate binding sites. Discrimination ability decreases with decreasing knowledge about the identity of the ligand-interacting binding site atoms. The decrease is quite abrupt when considering size and chemical composition alone, but much slower when geometry is included. We also observe that certain ligands are easier to discriminate than others. Interestingly, the subset of binding site atoms belonging to highly conserved residues is not sufficient to discriminate binding sites, implying that convergently evolved binding sites arrived at dissimilar solutions.
Availability: IsoCleft can be obtained from the authors.
Contact: rafael.najmanovich@ebi.ac.uk
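The abstract does not spell out how the graph matching works. The sketch below illustrates one generic way to detect a common 3D atomic arrangement between two binding sites, and is not the IsoCleft implementation. It assumes that atoms are given as hypothetical (type, coordinates) tuples, that two atoms may be paired only if their type labels are equal, and that a distance tolerance `tol` of 1.0 Å is acceptable. The function names `association_graph` and `max_clique` are also illustrative. Each node of the association graph pairs one atom from each site, an edge joins two nodes whose inter-atomic distances agree within the tolerance, and the largest clique, found here with a pivoting Bron-Kerbosch search, is the largest set of mutually consistent atom correspondences.

```python
from itertools import combinations
from math import dist


def association_graph(site_a, site_b, tol=1.0):
    """Build an association graph for two sets of (atom_type, (x, y, z)) atoms.

    Nodes are index pairs (i, j) of same-type atoms. Two nodes are joined
    when the distance between their atoms in site A and the distance in
    site B differ by at most `tol` (an assumed tolerance, in angstroms).
    """
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(site_a)
             for j, (type_b, _) in enumerate(site_b)
             if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # each atom may take part in only one correspondence
        d_a = dist(site_a[i][1], site_a[k][1])
        d_b = dist(site_b[j][1], site_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Return one largest clique using Bron-Kerbosch with pivoting."""
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adj), set())
    return best


# Toy data: site_b holds three atoms of site_a, rotated, translated and
# reordered, plus one unrelated sulphur atom.
site_a = [("N", (0, 0, 0)), ("O", (3, 0, 0)), ("C", (0, 4, 0)), ("O", (0, 0, 5))]
site_b = [("C", (6, 10, 10)), ("S", (20, 20, 20)),
          ("N", (10, 10, 10)), ("O", (10, 13, 10))]

matches = sorted(max_clique(association_graph(site_a, site_b)))
print(matches)  # [(0, 2), (1, 3), (2, 0)]
```

Each pair in the output maps an atom index in `site_a` to an atom index in `site_b`. Because only inter-atomic distances are compared, the atoms need not be connected in sequence or space, but a match found this way cannot distinguish an arrangement from its mirror image. The association graph also grows with the product of the two site sizes, which is why simplifications matter for large all-atom models.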