Modeling protein cores with Markov random fields
- 31 December 1994
- journal article
- Published by Elsevier BV in Mathematical Biosciences
- Vol. 124 (2), 149-179
- https://doi.org/10.1016/0025-5564(94)90041-8
Abstract
A mathematical formalism is introduced that has general applicability to many protein structure models used in the various approaches to the “inverse protein folding problem.” The inverse nature of the problem arises from the fact that one begins with a set of assumed tertiary structures and searches for those most compatible with a new sequence, rather than attempting to predict the structure directly from the new sequence. The formalism is based on the well-known theory of Markov random fields (MRFs). Our MRF formulation provides explicit representations for the relevant amino acid position environments and the physical topologies of the structural contacts. In particular, MRF models can readily be constructed for the secondary structure packing topologies found in protein domain cores, or other structural motifs, that are anticipated to be common among large sets of both homologous and nonhomologous proteins. MRF models are probabilistic and can exploit the statistical data from the limited number of proteins having known domain structures. The MRF approach leads to a new scoring function for comparing different threadings (placements) of a sequence through different structure models. The scoring function is very important, because comparing alternative structure models with each other is a key step in the inverse folding problem. Unlike previously published scoring functions, the one derived in this paper is based on a comprehensive probabilistic formulation of the threading problem.
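The abstract does not state the scoring function itself, so the following is only a minimal sketch of how a generic pairwise MRF could score a threading, not the function derived in the paper. It assumes a toy two-letter alphabet (H for hydrophobic, P for polar), per-position environment labels, and a single structural contact. The function name `threading_log_score` and every number in the tables are hypothetical and chosen purely for illustration.

```python
def threading_log_score(sequence, placement, environments, contacts,
                        site_logp, pair_logp):
    """Unnormalized log-score of one threading under a generic pairwise MRF.

    sequence     : string of residue letters
    placement    : placement[i] is the sequence index threaded onto core position i
    environments : environments[i] is the environment label of core position i
    contacts     : list of (i, j) core-position pairs that are in structural contact
    site_logp    : dict (environment, residue) -> log potential (hypothetical values)
    pair_logp    : dict (residue_a, residue_b) -> log potential (hypothetical values)
    """
    residues = [sequence[k] for k in placement]
    # Singleton terms: how well each residue fits its position environment.
    score = sum(site_logp[(env, res)] for env, res in zip(environments, residues))
    # Pairwise terms: one per edge of the contact graph.
    score += sum(pair_logp[(residues[i], residues[j])] for i, j in contacts)
    return score


# Toy structure model: three core positions, positions 0 and 2 in contact.
environments = ["buried", "exposed", "buried"]
contacts = [(0, 2)]

# Made-up log potentials, not statistics from any real protein set.
site_logp = {
    ("buried", "H"): -0.5, ("buried", "P"): -2.0,
    ("exposed", "H"): -2.0, ("exposed", "P"): -0.5,
}
pair_logp = {
    ("H", "H"): -0.25, ("H", "P"): -1.0,
    ("P", "H"): -1.0, ("P", "P"): -0.5,
}

sequence = "PHPHP"
candidate_placements = [[0, 1, 2], [1, 2, 3]]

# Compare alternative threadings of the same sequence by their log-scores.
scores = {
    tuple(p): threading_log_score(sequence, p, environments, contacts,
                                  site_logp, pair_logp)
    for p in candidate_placements
}
best_placement = max(scores, key=scores.get)
print(scores, best_placement)
```

In this toy setup the placement that puts hydrophobic residues at the two buried, contacting positions scores higher. A full probabilistic treatment would also need normalization so that scores from different structure models are comparable, which this sketch leaves out.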