Bayesian statistical analysis of protein side-chain rotamer preferences

Abstract
We present a Bayesian statistical analysis of the conformations of side chains in proteins from the Protein Data Bank. This is an extension of the backbone-dependent rotamer library, and includes rotamer populations and average χ angles for a full range of φ,ψ values. The Bayesian analysis used here provides a rigorous statistical method for taking account of varying amounts of data. Bayesian statistics requires the assumption of a prior distribution for parameters over their range of possible values. This prior distribution can be derived from previous data or from pooling some of the present data. The prior distribution is combined with the data to form the posterior distribution, which is a compromise between the prior distribution and the data. For the χ2, χ3, and χ4 rotamer prior distributions, we assume that the probability of each rotamer type is dependent only on the previous χ rotamer in the chain. For the backbone-dependence of the χ1 rotamers, we derive prior distributions from the product of the φ-dependent and ψ-dependent probabilities. Molecular mechanics calculations with the CHARMM22 potential show a strong similarity with the experimental distributions, indicating that proteins attain their lowest energy rotamers with respect to local backbone-side-chain interactions. The new library is suitable for use in homology modeling, protein folding simulations, and the refinement of X-ray and NMR structures.
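The way a prior and sparse counts combine into a posterior can be illustrated with a minimal Python sketch. It is not the procedure of the paper: it assumes a Dirichlet prior over three rotamer types for a single φ,ψ bin, builds the prior mean as the normalized product of a φ-dependent and a ψ-dependent probability vector, and reports the posterior mean. The function names, the `prior_weight` pseudo-count parameter, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def product_prior(p_phi: Sequence[float], p_psi: Sequence[float]) -> List[float]:
    """Normalized elementwise product of two rotamer probability vectors.

    Assumption: the prior for a (phi, psi) bin is taken proportional to
    P(r | phi) * P(r | psi); the normalization used in the paper may differ.
    """
    if len(p_phi) != len(p_psi):
        raise ValueError("probability vectors must have the same length")
    raw = [a * b for a, b in zip(p_phi, p_psi)]
    total = sum(raw)
    if total <= 0.0:
        raise ValueError("product of probabilities must have positive mass")
    return [x / total for x in raw]


def posterior_mean(prior: Sequence[float],
                   counts: Sequence[int],
                   prior_weight: float) -> List[float]:
    """Posterior mean of a Dirichlet-multinomial model.

    prior_weight (hypothetical parameter) is the total pseudo-count given
    to the prior; observed counts are added to prior_weight * prior.
    """
    if len(prior) != len(counts):
        raise ValueError("prior and counts must have the same length")
    n_obs = sum(counts)
    return [(prior_weight * p + c) / (prior_weight + n_obs)
            for p, c in zip(prior, counts)]


# Illustrative numbers only: three chi1 rotamer types in one phi,psi bin.
p_given_phi = [0.5, 0.3, 0.2]   # hypothetical P(r | phi)
p_given_psi = [0.4, 0.4, 0.2]   # hypothetical P(r | psi)
observed = [1, 3, 0]            # hypothetical counts in a sparse bin

prior = product_prior(p_given_phi, p_given_psi)
posterior = posterior_mean(prior, observed, prior_weight=10.0)
print("prior:    ", [round(p, 3) for p in prior])
print("posterior:", [round(p, 3) for p in posterior])
```

With no observations in a bin the posterior mean equals the prior, and as the counts grow it approaches the observed frequencies, which is the compromise between prior and data described above.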