Hidden Markov models of biological primary sequence information.

1 February 1994

journal article
research article
Published by Proceedings of the National Academy of Sciences in Proceedings of the National Academy of Sciences of the United States of America

Vol. 91 (3), 1059-1063
https://doi.org/10.1073/pnas.91.3.1059

Abstract

Hidden Markov model (HMM) techniques are used to model families of biological sequences. A smooth and convergent algorithm is introduced to iteratively adapt the transition and emission parameters of the models from the examples in a given family. The HMM approach is applied to three protein families: globins, immunoglobulins, and kinases. In all cases, the models derived capture the important statistical characteristics of the family and can be used for a number of tasks, including multiple alignments, motif detection, and classification. For K sequences of average length N, this approach yields an effective multiple-alignment algorithm which requires O(KN^2) operations, linear in the number of sequences.
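
The operation count in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and it covers only the alignment step, not the parameter-adaptation algorithm. It assumes a simplified left-to-right model with one match emission table per position and flat log-penalties for insertions and deletions. The names `viterbi_align` and `multiple_alignment`, the penalty values, and the toy DNA alphabet are all hypothetical. Each sequence is aligned to the model independently by a Viterbi-style dynamic program over roughly N x N cells when the model length is close to N, so K sequences cost on the order of K * N^2 cell updates.

```python
import math

def viterbi_align(seq, match_emissions, log_insert=-3.0, log_delete=-3.0):
    """Best path of `seq` through a simplified left-to-right model.

    match_emissions: one dict per model position, symbol -> probability.
    log_insert / log_delete: assumed flat log-penalties (hypothetical values).
    Returns (log_score, path, cells_visited).
    """
    n, m = len(seq), len(match_emissions)
    neg_inf = float("-inf")
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0
    cells = 0
    for i in range(n + 1):
        for j in range(m + 1):
            cells += 1  # constant work per cell
            if i == 0 and j == 0:
                continue
            best, move = neg_inf, None
            if i > 0 and j > 0:  # match: residue i emitted by position j
                p = match_emissions[j - 1].get(seq[i - 1], 1e-6)
                cand = score[i - 1][j - 1] + math.log(p)
                if cand > best:
                    best, move = cand, "M"
            if i > 0:  # insert: residue i emitted between model positions
                cand = score[i - 1][j] + log_insert
                if cand > best:
                    best, move = cand, "I"
            if j > 0:  # delete: model position j skipped
                cand = score[i][j - 1] + log_delete
                if cand > best:
                    best, move = cand, "D"
            score[i][j], back[i][j] = best, move
    # Trace back from the final cell to recover the path.
    path, i, j = [], n, m
    while i > 0 or j > 0:
        move = back[i][j]
        if move == "M":
            path.append(("M", j, seq[i - 1]))
            i, j = i - 1, j - 1
        elif move == "I":
            path.append(("I", j, seq[i - 1]))
            i -= 1
        else:
            path.append(("D", j, "-"))
            j -= 1
    path.reverse()
    return score[n][m], path, cells

def multiple_alignment(seqs, match_emissions):
    """Align each sequence to the model; columns are the model positions."""
    rows, total_cells = [], 0
    for s in seqs:
        _, path, cells = viterbi_align(s, match_emissions)
        total_cells += cells
        # Inserted residues are dropped here to keep one column per position.
        rows.append("".join(sym for op, _, sym in path if op != "I"))
    return rows, total_cells

def peaked(symbol):
    """Toy emission table that strongly prefers one symbol."""
    return {s: (0.97 if s == symbol else 0.01) for s in "ACGT"}

model = [peaked("A"), peaked("C"), peaked("G")]
rows, total_cells = multiple_alignment(["ACG", "AG", "ACTG"], model)
print(rows, total_cells)
```

Because every sequence is aligned to the same model rather than to every other sequence, the total work is the sum of the per-sequence tables, which is why the cost grows linearly in K. A full profile HMM would track match, insert, and delete states separately with their own transition probabilities; the flat penalties here stand in for that structure.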

