Statistical Mechanical Treatment of Protein Conformation. 4. A Four-State Model for Specific-Sequence Copolymers of Amino Acids

Abstract
One-dimensional short-range interaction models for specific-sequence copolymers of amino acids are being developed in this series of papers. In this paper, our earlier three-state model [involving helical (h), extended (ε), and coil (or other) (c) states] is extended to a four-state model by preserving the h and ε states, introducing the chain-reversal state (R and S), and redefining the c state. This model involves six parameters (wh, vh, vε, vR, vS, and uc) and requires a 6 × 6 statistical weight matrix. A nearest-neighbor approximation of the four-state model is also formulated; it requires a 5 × 5 matrix involving the same six parameters. By expressing the statistical weights relative to that of the ε state, only five parameters (wh, vh, vR, vS, and uc) are required in both the 6 × 6 and 5 × 5 matrices. The statistical weights for the four-state model are evaluated from the atomic coordinates of the X-ray structures of 26 native proteins. These statistical weights, and the four-state model, are used to develop a procedure to predict the backbone conformations of proteins. Since the prediction of helical and extended conformations is carried out by the procedure described in papers 1-3 of this series, we focus particular attention on chain-reversal conformations in this paper. The conformational-sequence probabilities of finding a residue in the h, ε, R, S, or c state, and of finding two consecutive residues in a chain-reversal conformation, defined as relative values with respect to their average values over the whole molecule, are calculated for 23 proteins. By comparing these conformational-sequence probabilities to experimental X-ray observations, it was found that, in addition to the prediction of helical and extended conformations (reported in paper 3), 219 of the 372 chain-reversal regions (about 59%) observed by X-ray diffraction studies of 23 proteins were predicted correctly.
These results suggest that the assumption of the dominance of short-range interactions in determining chain-reversal (as well as helical or extended) conformations in proteins, on which the predictive scheme is based, is a reasonable one. Finally, in the Appendix, the property of asymmetric nucleation of helical sequences is introduced into the (nearest-neighbor) four-state model.
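As an illustration of how per-residue state probabilities follow from a nearest-neighbor statistical weight scheme, the sketch below sums over all conformations of a short chain with a forward-backward recursion. It is not the 5 × 5 matrix of this paper: the function `weight` and its rule (h receives wh after another h and vh otherwise; ε is the reference state with weight 1; R, S, and c receive vR, vS, and uc regardless of the neighbor) are simplifying assumptions made only for this example, and the numerical values are arbitrary rather than fitted to any protein.

```python
from itertools import product

STATES = ("h", "e", "R", "S", "c")  # "e" stands for the extended (epsilon) state


def weight(prev, cur, p):
    """Assumed illustrative weight of state `cur` following state `prev`.

    `p` holds one residue's parameters: wh, vh, vR, vS, uc (relative to epsilon).
    """
    if cur == "h":
        return p["wh"] if prev == "h" else p["vh"]
    if cur == "e":
        return 1.0  # reference state
    return {"R": p["vR"], "S": p["vS"], "c": p["uc"]}[cur]


def forward_backward(params):
    """Return the partition function Z and per-residue state probabilities."""
    n = len(params)
    # fwd[i][s]: summed weight of residues 0..i with residue i in state s
    fwd = [{s: weight(None, s, params[0]) for s in STATES}]
    for i in range(1, n):
        fwd.append({s: sum(fwd[i - 1][q] * weight(q, s, params[i]) for q in STATES)
                    for s in STATES})
    # bwd[i][s]: summed weight of residues i+1..n-1 given residue i in state s
    bwd = [dict.fromkeys(STATES, 1.0) for _ in range(n)]
    for i in range(n - 2, -1, -1):
        bwd[i] = {s: sum(weight(s, q, params[i + 1]) * bwd[i + 1][q] for q in STATES)
                  for s in STATES}
    z = sum(fwd[-1].values())
    probs = [{s: fwd[i][s] * bwd[i][s] / z for s in STATES} for i in range(n)]
    return z, probs


def brute_force_z(params):
    """Check: enumerate every conformation explicitly (feasible for short chains only)."""
    total = 0.0
    for conf in product(STATES, repeat=len(params)):
        w, prev = 1.0, None
        for p, s in zip(params, conf):
            w *= weight(prev, s, p)
            prev = s
        total += w
    return total


# Arbitrary example values for a four-residue chain (not taken from the paper).
example = [dict(wh=1.5, vh=0.2, vR=0.3, vS=0.3, uc=0.8)] * 4
Z, P = forward_backward(example)
```

Dividing each entry of `P` by its average over the chain gives relative probabilities of the kind described above. A sequence-specific calculation would supply a different parameter dictionary for each residue type.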