From HMM's to segment models: a unified view of stochastic modeling for speech recognition

1 September 1996

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Speech and Audio Processing

Vol. 4 (5), 360-378
https://doi.org/10.1109/89.536930

Abstract

Many alternative models have been proposed to address some of the shortcomings of the hidden Markov model (HMM), which is currently the most popular approach to speech recognition. In particular, a variety of models that could be broadly classified as segment models have been described for representing a variable-length sequence of observation vectors in speech recognition applications. Since there are many aspects in common between these approaches, including the general recognition and training problems, it is useful to consider them in a unified framework. The paper describes a general stochastic model that encompasses most of the models proposed in the literature, pointing out similarities of the models in terms of correlation and parameter tying assumptions, and drawing analogies between segment models and HMMs. In addition, we summarize experimental results assessing different modeling assumptions and point out remaining open questions.

Cited by 333 articles