A fast and accurate computational approach to protein ionization

Abstract
We report a very fast and accurate physics-based method to calculate pH-dependent electrostatic effects in protein molecules and to predict the pK values of individual sites of titration. In addition, a CHARMm-based algorithm is included to construct and refine the spatial coordinates of all hydrogen atoms at a given pH. The present method combines electrostatic energy calculations based on the Generalized Born approximation with an iterative mobile clustering approach to calculate the equilibria of proton binding to multiple titration sites in protein molecules. The use of the GBIM (Generalized Born with Implicit Membrane) CHARMm module makes it possible to model not only water-soluble proteins but membrane proteins as well. The method includes a novel algorithm for preliminary refinement of hydrogen coordinates. Another difference from existing approaches is that, instead of monopeptides, a set of relaxed pentapeptide structures are used as model compounds. Tests on a set of 24 proteins demonstrate the high accuracy of the method. On average, the RMSD between predicted and experimental pK values is close to 0.5 pK units on this data set, and the accuracy is achieved at very low computational cost. The pH-dependent assignment of hydrogen atoms also shows very good agreement with protonation states and hydrogen-bond network observed in neutron-diffraction structures. The method is implemented as a computational protocol in Accelrys Discovery Studio and provides a fast and easy way to study the effect of pH on many important mechanisms such as enzyme catalysis, ligand binding, protein-protein interactions, and protein stability.