Alignments grow, secondary structure prediction improves

6 December 2001

Research article, published by Wiley in Proteins: Structure, Function, and Bioinformatics

Vol. 46 (2), 197-205
https://doi.org/10.1002/prot.10029

Abstract

Using information from sequence alignments significantly improves protein secondary structure prediction. Typically, more divergent profiles yield better predictions. Recently, various groups have shown that accuracy can be improved significantly by using PSI‐BLAST profiles to develop new prediction methods. Here, we focused on the influences of various alignment strategies on two 8‐year‐old PHD methods. The following results stood out. (i) PHD using pairwise alignments predicts about 72% of all residues correctly in one of the three states: helix, strand, and other. Using larger databases and PSI‐BLAST raised accuracy to 75%. (ii) More than 60% of the improvement originated from the growth of current sequence databases; about 20% resulted from detailed changes in the alignment procedure (substitution matrix, thresholds, and gap penalties). Another 20% of the improvement resulted from carefully using iterated PSI‐BLAST searches. (iii) It is of interest that we failed to improve prediction accuracy further when attempting to refine the alignment by dynamic programming (MaxHom and ClustalW). (iv) Improvement through family growth appears to saturate at some point. However, most families have not reached this saturation. Hence, we anticipate that prediction accuracy will continue to rise with database growth. Proteins 2002;46:197–205.
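The accuracy figures quoted in the abstract (about 72% and 75%) are per-residue three-state scores: the share of residues whose predicted state, out of helix, strand, and other, matches the observed one. The sketch below shows one way such a score could be computed. The function name `q3_accuracy`, the one-letter codes `H`, `E`, and `L`, and the example strings are illustrative assumptions, not taken from the paper or from the PHD software.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the percentage of residues predicted in the correct state.

    Both arguments are strings of per-residue state codes. The codes used
    here (H = helix, E = strand, L = other) are an assumed convention.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 12-residue example (not data from the paper).
observed = "HHHHLLEEEELL"
predicted = "HHHLLLEEELLL"
print(f"Q3 = {q3_accuracy(predicted, observed):.1f}%")
```

On this scale, the reported rise from 72% to 75% is a gain of 3 percentage points. Taking the abstract's shares at face value, database growth would account for a little under 2 points of that gain, with alignment-procedure changes and iterated PSI-BLAST searches contributing roughly 0.6 points each.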

Cited by 159 articles