Protein structure validation by generalized linear model root‐mean‐square deviation prediction

Abstract
Large‐scale initiatives for obtaining spatial protein structures by experimental or computational means have accentuated the need for the critical assessment of protein structure determination and prediction methods. These include blind test projects such as the critical assessment of protein structure prediction (CASP) and the critical assessment of protein structure determination by nuclear magnetic resonance (CASD‐NMR). An important aim is to establish structure validation criteria that can reliably assess the accuracy of a new protein structure. Various quality measures derived from the coordinates have been proposed. A universal structural quality assessment method should combine multiple individual scores in a meaningful way, which is challenging because of their different measurement units. Here, we present a method based on a generalized linear model (GLM) that combines diverse protein structure quality scores into a single quantity with intuitive meaning, namely the predicted coordinate root‐mean‐square deviation (RMSD) value between the present structure and the (unavailable) “true” structure (GLM‐RMSD). For two sets of structural models from the CASD‐NMR and CASP projects, this GLM‐RMSD value was compared with the actual accuracy given by the RMSD value to the corresponding, experimentally determined reference structure from the Protein Data Bank (PDB). The correlation coefficients between actual (model vs. reference from PDB) and predicted (model vs. “true”) heavy‐atom RMSDs were 0.69 and 0.76 for the CASD‐NMR and CASP datasets, respectively, which are considerably higher than those for the individual scores (−0.24 to 0.68). The GLM‐RMSD can thus predict the accuracy of protein structures more reliably than individual coordinate‐based quality scores.
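The idea of combining several quality scores into one predicted RMSD, and then checking that prediction against the actual RMSD by a correlation coefficient, can be illustrated with a minimal sketch. It is not the implementation used in this work: the abstract does not state the GLM family or link function, so the sketch assumes the simplest case (Gaussian errors with an identity link, which reduces to ordinary least squares). The function names, the two toy scores, and all numerical values are hypothetical and serve only to show the mechanics.

```python
import math


def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_glm_identity(scores, rmsd):
    """Fit rmsd ~ b0 + sum_k b_k * score_k by least squares.

    Assumption: Gaussian GLM with identity link (ordinary least squares),
    solved through the normal equations (X^T X) beta = X^T y.
    """
    X = [[1.0] + list(row) for row in scores]  # prepend intercept column
    p = len(X[0])
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(p)] for i in range(p)]
    Xty = [sum(x[i] * y for x, y in zip(X, rmsd)) for i in range(p)]
    return solve(XtX, Xty)


def predict_rmsd(beta, row):
    """Predicted RMSD for one structure from its quality scores."""
    return beta[0] + sum(b * s for b, s in zip(beta[1:], row))


def pearson(a, b):
    """Pearson correlation coefficient between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


# Hypothetical toy data: two made-up quality scores per structure, and
# "actual" RMSD values constructed to be exactly linear in those scores.
toy_scores = [(0.1, 2.0), (0.4, 1.5), (0.9, 3.0), (1.3, 0.5), (2.0, 2.5)]
toy_rmsd = [1.0 + 2.0 * s1 - 0.5 * s2 for s1, s2 in toy_scores]

beta = fit_glm_identity(toy_scores, toy_rmsd)
predicted = [predict_rmsd(beta, row) for row in toy_scores]
correlation = pearson(toy_rmsd, predicted)
```

Because the toy RMSD values are constructed to be exactly linear in the scores, the fit recovers the generating coefficients and the correlation is essentially 1. With real quality scores the relationship is noisy, so a correlation below 1 is expected, and it should be evaluated on structures that were not used for fitting.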
Funding Information
  • Volkswagen Foundation
  • Deutsche Forschungsgemeinschaft (DFG grant JA1952/1-1 (to V.J. and P.G.))
  • e-NMR and WeNMR projects of the European Commission and Japan Society for the Promotion of Science (JSPS)
  • National Institutes of Health Protein Structure Initiative (U54 GM094597 (to G.T.M.))