Recognizing protein–protein interfaces with empirical potentials and reduced amino acid alphabets

Open Access

27 July 2007

journal article
Published by Springer Science and Business Media LLC in BMC Bioinformatics

Vol. 8 (1), 270
https://doi.org/10.1186/1471-2105-8-270

Abstract

Background In structural genomics, an important goal is the detection and classification of protein–protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein–protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue–residue interactions and a stepwise distance dependence. We used greater computational resources, however, constructing 290,000 decoys for 219 protein–protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64,000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment.
Results Performance is similar to several other statistical potentials of the same complexity. For example, the CAPRI target structure is correctly ranked ahead of 90% of its decoys in 6 cases out of 13. The hierarchy of amino acid alphabets leads to a coherent hierarchy of energy functions, with qualitatively similar parameters for similar amino acid types at all levels. Most remarkably, the performance with six amino acid classes is equivalent to that of the most detailed, 20-class energy function.
Conclusion This suggests that six carefully chosen amino acid classes are sufficient to encode specificity in protein–protein interactions, and provide a starting point to develop more complicated energy functions.
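
As an illustration of the functional form described in the abstract (residue–residue interactions over a reduced alphabet, with a stepwise distance dependence), the following is a minimal sketch and not the authors' implementation. The two-class grouping, the bin edges, the parameter values, and the names `RESIDUE_CLASS`, `BIN_EDGES`, `PARAMS`, `interface_energy`, and `fraction_of_decoys_beaten` are all assumptions made for this example; the abstract does not specify any of them.

```python
import bisect
import math
from itertools import product

# ASSUMPTION: a hypothetical two-class alphabet (hydrophobic "h", polar "p").
# The abstract does not list the groupings actually used.
RESIDUE_CLASS = {aa: "h" for aa in "AVLIMFWC"}
RESIDUE_CLASS.update({aa: "p" for aa in "GSTYNQDEKRHP"})

# ASSUMPTION: illustrative bin edges in angstroms; pairs at or beyond the
# last edge contribute nothing.
BIN_EDGES = (5.0, 8.0)

# ASSUMPTION: made-up parameters, one value per distance bin per class pair.
PARAMS = {
    ("h", "h"): (-1.0, -0.5),
    ("h", "p"): (0.2, 0.1),
    ("p", "p"): (-0.1, 0.0),
}


def interface_energy(chain_a, chain_b, params=PARAMS, edges=BIN_EDGES):
    """Sum stepwise pair energies over all residue pairs across the interface.

    Each chain is a list of (one_letter_code, (x, y, z)) tuples, with one
    representative point per residue.
    """
    total = 0.0
    for (aa_i, xyz_i), (aa_j, xyz_j) in product(chain_a, chain_b):
        distance = math.dist(xyz_i, xyz_j)
        bin_index = bisect.bisect_right(edges, distance)
        if bin_index == len(edges):
            continue  # beyond the cutoff: no contribution
        # Sort the class pair so (h, p) and (p, h) share one parameter set.
        pair = tuple(sorted((RESIDUE_CLASS[aa_i], RESIDUE_CLASS[aa_j])))
        total += params[pair][bin_index]
    return total


def fraction_of_decoys_beaten(native_energy, decoy_energies):
    """Fraction of decoys whose energy is strictly higher than the native's."""
    return sum(e > native_energy for e in decoy_energies) / len(decoy_energies)


# Toy data: a two-residue "receptor" with a native and a decoy "ligand" pose.
receptor = [("L", (0.0, 0.0, 0.0)), ("K", (10.0, 0.0, 0.0))]
native_ligand = [("V", (4.0, 0.0, 0.0)), ("E", (14.0, 0.0, 0.0))]
decoy_ligand = [("V", (14.0, 0.0, 0.0)), ("E", (4.0, 0.0, 0.0))]

native_energy = interface_energy(receptor, native_ligand)
decoy_energy = interface_energy(receptor, decoy_ligand)
print(native_energy, decoy_energy,
      fraction_of_decoys_beaten(native_energy, [decoy_energy]))
```

With a function of this shape, moving from 2 to 20 classes only changes the class lookup and the size of the parameter table, which is one way to read the hierarchy of alphabets compared in the abstract. The fraction returned by `fraction_of_decoys_beaten` corresponds to the kind of criterion quoted in the Results (native ranked ahead of 90% of its decoys).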
