Discriminative learning in sequential pattern recognition
- 26 September 2008
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Signal Processing Magazine
- Vol. 25 (5), 14-36
- https://doi.org/10.1109/msp.2008.926652
Abstract
In this article, we studied the objective functions of MMI, MCE, and MPE/MWE for discriminative learning in sequential pattern recognition. We presented an approach that unifies these objective functions in the common rational-function form of (25). The exact structure of the rational-function form for each discriminative criterion was derived and studied. While the rational-function form of MMI was already known, we provided a theoretical proof that a similar rational-function form exists for the objective functions of MCE and MPE/MWE. Moreover, we showed that the rational-function forms for MMI, MCE, and MPE/MWE differ only in the constant weighting factors C_DT(s_1 ... s_R), and that these weighting factors depend only on the labeled sequence s_1 ... s_R and are independent of the parameter set Λ to be optimized. The derived rational-function form allows the GT/EBW-based parameter optimization framework to be applied directly in discriminative learning with all three criteria. In the past, the lack of an appropriate rational-function form was a difficulty for MCE and MPE/MWE, because without it the GT/EBW-based framework cannot be applied directly. Based on the unified rational-function form, we derived, in a tutorial style, the GT/EBW-based parameter optimization formulas for both discrete HMMs and CDHMMs in discriminative learning using the MMI, MCE, and MPE/MWE criteria.
The unifying review provided in this article builds on a large number of earlier contributions that have been cited and discussed throughout the article. Here we provide a brief summary of such background work. Extension to large-scale speech recognition tasks was accomplished in the work of [59] and [60]. The dissertation of [47] further improved the MMI criterion to that of MPE/MWE.
In a parallel vein, the work of [20] provided an alternative approach to that of [41], attempting to derive more rigorously a CDHMM re-estimation formula that yields positive growth of the MMI objective function. A crucial error in this attempt was corrected in [2] to establish an existence proof of such positive growth. The main goal of this article is to provide an underlying foundation for MMI, MCE, and MPE/MWE at the objective-function level, to facilitate the development of new parameter optimization techniques and to incorporate other pattern recognition concepts, e.g., discriminative margins [66], into the current discriminative learning paradigm.
This publication has 42 references indexed in Scilit:
- Haplotype inference using a Bayesian Hidden Markov model. Genetic Epidemiology, 2007
- A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
- Minimum error rate training for PHMM-based text recognition. IEEE Transactions on Image Processing, 1999
- Speech trajectory discrimination using the minimum classification error learning. IEEE Transactions on Speech and Audio Processing, 1998
- Minimum classification error rate methods for speech recognition. IEEE Transactions on Speech and Audio Processing, 1997
- Hidden Markov model approach to skill learning and its application to telerobotics. IEEE Transactions on Robotics and Automation, 1994
- Discriminative learning for minimum error classification (pattern recognition). IEEE Transactions on Signal Processing, 1992
- First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method. Neural Computation, 1992
- An inequality for rational functions with applications to some statistical estimation problems. IEEE Transactions on Information Theory, 1991
- Growth transformations for functions on manifolds. Pacific Journal of Mathematics, 1968