Nonspeech segment rejection based on prosodic information for robust speech recognition

Abstract
A new scheme for nonspeech rejection is proposed, based on the observation that most nonspeech segments lack the well-defined prosodic structure that speech segments have. Parameters characterizing the smoothness of the peak-index series and of the peak-amplitude series of the normalized autocorrelation function are used to decide whether a segment should be rejected as nonspeech. Receiver operating characteristic (ROC) curves and reductions in recognition word error rate show that our approach is more effective than garbage-model-based schemes when used in telephone speech recognition.
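
The abstract does not give the exact parameters, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the normalized autocorrelation is computed per frame, that the largest peak within a pitch-like lag range yields one peak index and one peak amplitude per frame, and that "smoothness" is measured as the mean absolute frame-to-frame difference. All function names, the frame and lag settings, and the roughness measure are hypothetical choices made for illustration.

```python
import math
import random


def normalized_autocorr_peak(frame, min_lag, max_lag):
    """Return (lag, value) of the largest normalized autocorrelation
    value for lags in [min_lag, max_lag]. Normalization divides by the
    frame energy (the lag-0 autocorrelation)."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return min_lag, 0.0
    best_lag, best_val = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        acc = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        val = acc / energy
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag, best_val


def peak_series(signal, frame_len, hop, min_lag, max_lag):
    """Slide a window over the signal and collect the peak-index series
    and the peak-amplitude series, one entry per frame."""
    lags, amps = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        lag, amp = normalized_autocorr_peak(frame, min_lag, max_lag)
        lags.append(lag)
        amps.append(amp)
    return lags, amps


def roughness(series):
    """Assumed smoothness measure: mean absolute first difference.
    Smaller values mean a smoother series."""
    if len(series) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(series, series[1:])]
    return sum(diffs) / len(diffs)


# Illustrative settings (assumptions): 8 kHz telephone-band sampling,
# 30 ms frames, 10 ms hop, lags covering roughly 67 Hz to 400 Hz.
FRAME_LEN, HOP, MIN_LAG, MAX_LAG = 240, 80, 20, 120

# A 200 Hz tone stands in for a voiced, prosodically regular segment.
tone = [math.sin(2.0 * math.pi * n / 40.0) for n in range(800)]

# Uniform white noise stands in for a nonspeech segment.
random.seed(0)
noise = [random.uniform(-1.0, 1.0) for _ in range(800)]

tone_lags, tone_amps = peak_series(tone, FRAME_LEN, HOP, MIN_LAG, MAX_LAG)
noise_lags, noise_amps = peak_series(noise, FRAME_LEN, HOP, MIN_LAG, MAX_LAG)

print("tone  lag roughness:", roughness(tone_lags))
print("noise lag roughness:", roughness(noise_lags))
```

In this toy setting the tone gives a nearly constant peak index and a high peak amplitude, while the noise gives an erratic peak index and low peak amplitudes. A rejection rule could threshold such roughness values, but the thresholds and the way the two series are combined would have to come from the paper itself.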
