A method to recognize distant repeats in protein sequences
- 1 December 1993
- journal article
- research article
- Published by Wiley in Proteins: Structure, Function, and Bioinformatics
- Vol. 17 (4), 391-411
- https://doi.org/10.1002/prot.340170407
Abstract
An automated algorithm is presented that delineates protein sequence fragments which display similarity. The method combines a selection of the local nonoverlapping sequence alignments with the highest similarity scores with a graph-theoretical approach that elucidates consistent start and end points for the fragments comprising one or more ensembles of related subsequences. The procedure allows the simultaneous identification of different types of repeats within one sequence. A multiple alignment of the resulting fragments is performed and a consensus sequence is derived from the ensemble(s). Finally, a profile is constructed from the multiple alignment to detect possible, more distant members within the sequence. The method tolerates mutations in the repeats as well as insertions and deletions. The sequence spans between the various repeats or repeat clusters may be of different lengths. The technique has been applied to a number of proteins where the repeating fragments have been derived from information additional to the protein sequences.
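The first step named in the abstract, finding high-scoring local alignments of a sequence against itself, can be illustrated with a minimal sketch. This is not the published algorithm: it is a plain Smith-Waterman-style recursion restricted to the upper triangle of the self-comparison matrix, and it reports only the single best pair of fragments. The function name `best_self_alignment`, the identity scoring (match +2, mismatch -1), the linear gap penalty of -2, and the simple nonoverlap test are all assumptions made for illustration; the paper's scoring scheme, alignment selection, and graph-theoretical step are not reproduced here.

```python
def best_self_alignment(seq, match=2, mismatch=-1, gap=-2):
    """Best local alignment of seq against itself (illustrative sketch).

    Returns (score, (start1, end1), (start2, end2)) with 0-based,
    half-open spans, or (0, (0, 0), (0, 0)) if nothing scores above zero.
    All scoring parameters are hypothetical defaults.
    """
    n = len(seq)
    # H[i][j]: best score of a local alignment ending at seq[i-1], seq[j-1]
    H = [[0] * (n + 1) for _ in range(n + 1)]
    # origin[i][j]: 0-based (row, column) position where that alignment began
    origin = [[None] * (n + 1) for _ in range(n + 1)]
    best = (0, (0, 0), (0, 0))

    for i in range(1, n + 1):
        # Upper triangle only (j > i): skips the trivial match of seq to itself
        for j in range(i + 1, n + 1):
            s = match if seq[i - 1] == seq[j - 1] else mismatch
            diag_start = origin[i - 1][j - 1] if H[i - 1][j - 1] > 0 else (i - 1, j - 1)
            candidates = [
                (H[i - 1][j - 1] + s, diag_start),     # extend or start on the diagonal
                (H[i - 1][j] + gap, origin[i - 1][j]), # gap in the second fragment
                (H[i][j - 1] + gap, origin[i][j - 1]), # gap in the first fragment
            ]
            score, start = max(candidates, key=lambda c: c[0])
            if score > 0:
                H[i][j], origin[i][j] = score, start
                # Keep only pairs whose two fragments do not overlap
                if score > best[0] and i <= start[1]:
                    best = (score, (start[0], i), (start[1], j))
    return best


print(best_self_alignment("GAVLKGAVLK"))  # (10, (0, 5), (5, 10))
```

For the toy sequence `GAVLKGAVLK` the sketch pairs residues 0-4 with residues 5-9. Detecting several repeat families at once, as the abstract describes, would require keeping many such alignments and reconciling their boundaries, which this example does not attempt.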