Abstract
A single-microphone noise suppression algorithm is described that is based on a novel approach for the estimation of the signal-to-noise ratio (SNR) in different frequency channels: The input signal is transformed into neurophysiologically-motivated spectro-temporal input features. These patterns are called amplitude modulation spectrograms (AMS), as they contain information of both center frequencies and modulation frequencies within each 32 ms-analysis frame. The different representations of speech and noise in AMS patterns are detected by a neural network, which estimates the present SNR in each frequency channel. Quantitative experiments show a reliable estimation of the SNR for most types of nonspeech background noise. For noise suppression, the frequency bands are attenuated according to the estimated present SNR using a Wiener filter approach. Objective speech quality measures, informal listening tests, and the results of automatic speech recognition experiments indicate a substantial benefit from AMS-based noise suppression, in comparison to unprocessed noisy speech.

This publication has 23 references indexed in Scilit: