Discriminative learning in sequential pattern recognition

26 September 2008

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Signal Processing Magazine

Vol. 25 (5), 14-36
https://doi.org/10.1109/msp.2008.926652

Abstract

In this article, we studied the objective functions of MMI, MCE, and MPE/MWE for discriminative learning in sequential pattern recognition. We presented an approach that unifies the objective functions of MMI, MCE, and MPE/MWE in the common rational-function form of (25). The exact structure of the rational-function form for each discriminative criterion was derived and studied. While the rational-function form of MMI has been known in the past, we provided the theoretical proof that a similar rational-function form exists for the objective functions of MCE and MPE/MWE. Moreover, we showed that the rational-function forms for the objective functions of MMI, MCE, and MPE/MWE differ only in the constant weighting factors C_DT(s_1 ... s_R), and that these weighting factors depend only on the labeled sequence s_1 ... s_R and are independent of the parameter set Λ to be optimized. The derived rational-function form for MMI, MCE, and MPE/MWE allows the GT/EBW-based parameter optimization framework to be applied directly in discriminative learning. In the past, the lack of an appropriate rational-function form was a difficulty for MCE and MPE/MWE, because without this form the GT/EBW-based parameter optimization framework cannot be applied directly. Based on the unified rational-function form, we derived, in a tutorial style, the GT/EBW-based parameter optimization formulas for both discrete HMMs and CDHMMs in discriminative learning using the MMI, MCE, and MPE/MWE criteria.

The unifying review provided in this article has been based upon a large number of earlier contributions that have been cited and discussed throughout the article. Here we provide a brief summary of such background work. Extension to large-scale speech recognition tasks was accomplished in the work of [59] and [60]. The dissertation of [47] further improved the MMI criterion to that of MPE/MWE. In a parallel vein, the work of [20] provided an alternative approach to that of [41], attempting to provide, more rigorously, a CDHMM re-estimation formula that gives positive growth of the MMI objective function. A crucial error in this attempt was corrected in [2] to establish an existence proof of such positive growth.

The main goal of this article is to provide an underlying foundation for MMI, MCE, and MPE/MWE at the objective-function level, to facilitate the development of new parameter optimization techniques and to incorporate other pattern recognition concepts, e.g., discriminative margins [66], into the current discriminative learning paradigm.
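To make the link between a rational-function objective and GT/EBW-style optimization more concrete, the following is a minimal sketch on a toy problem, not the article's HMM formulas. It assumes a hypothetical objective R(theta) = (a . theta) / (b . theta) over a single discrete distribution theta, with illustrative positive vectors `a` and `b`, and applies the well-known growth-transformation update theta_i <- theta_i (dR/dtheta_i + D) / sum_j theta_j (dR/dtheta_j + D). The function names, the toy objective, and the rule used to pick the constant D are all assumptions made for illustration.

```python
import numpy as np

def rational_objective(theta, a, b):
    """Toy rational function R(theta) = (a . theta) / (b . theta)."""
    return float(a @ theta) / float(b @ theta)

def ebw_step(theta, a, b, margin=1.0):
    """One EBW-style growth-transformation step on the probability simplex."""
    num, den = float(a @ theta), float(b @ theta)
    grad = (a * den - b * num) / den**2  # dR/dtheta_i for the toy objective
    # Constant D: chosen here (an assumption of this sketch) just large
    # enough that every grad_i + D is strictly positive.
    D = max(0.0, float(-grad.min())) + margin
    unnorm = theta * (grad + D)
    return unnorm / unnorm.sum()  # renormalize so theta stays a distribution

# Hypothetical toy data: positive numerator and denominator weights.
a = np.array([3.0, 1.0, 2.0])
b = np.array([1.0, 2.0, 1.0])
theta = np.full(3, 1.0 / 3.0)  # start from the uniform distribution

history = [rational_objective(theta, a, b)]
for _ in range(50):
    theta = ebw_step(theta, a, b)
    history.append(rational_objective(theta, a, b))
```

In this toy setting each step keeps `theta` a valid probability distribution, and the values recorded in `history` should not decrease. A larger `margin` gives smaller, more conservative steps, which mirrors the usual trade-off in choosing the EBW constant.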

This publication has 42 references indexed in Scilit:

Haplotype inference using a Bayesian Hidden Markov model
Genetic Epidemiology, 2007
A discrete contextual stochastic model for the off-line recognition of handwritten Chinese characters
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2001
Minimum error rate training for PHMM-based text recognition
IEEE Transactions on Image Processing, 1999
Speech trajectory discrimination using the minimum classification error learning
IEEE Transactions on Speech and Audio Processing, 1998
Minimum classification error rate methods for speech recognition
IEEE Transactions on Speech and Audio Processing, 1997
Hidden Markov model approach to skill learning and its application to telerobotics
IEEE Transactions on Robotics and Automation, 1994
Discriminative learning for minimum error classification (pattern recognition)
IEEE Transactions on Signal Processing, 1992
First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method
Neural Computation, 1992
An inequality for rational functions with applications to some statistical estimation problems
IEEE Transactions on Information Theory, 1991
Growth transformations for functions on manifolds
Pacific Journal of Mathematics, 1968

Cited by 86 articles