Towards unsupervised pattern discovery in speech

Abstract
We present an unsupervised algorithm for discovering acoustic patterns in speech by finding matching subsequences between pairs of utterances. The approach we describe is, in theory, language and topic independent, and is particularly well suited for processing large amounts of speech from a single speaker. A variation of dynamic time warping (DTW), which we call segmental DTW, is used to perform the pairwise utterance comparison. Using academic lecture data, we describe two potentially useful applications for the segmental DTW output: augmenting speech recognition transcriptions for information retrieval, and clustering speech segments for unsupervised word discovery. Some preliminary qualitative results for both experiments are shown, and the implications for future work and applications are discussed.
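
For readers unfamiliar with the underlying technique, the following is a minimal sketch of classic DTW between two feature sequences. It is not the segmental DTW variant named in the abstract, whose details are not given here. The function name `dtw_distance`, the optional `band` parameter, and the Euclidean frame distance are assumptions chosen for illustration.

```python
import math


def dtw_distance(a, b, band=None):
    """Classic DTW alignment cost between two sequences of feature vectors.

    Illustrative only: this is whole-sequence DTW, not segmental DTW.
    `a` and `b` are lists of equal-dimension float vectors (e.g. frames).
    `band` (hypothetical parameter) optionally limits |i - j| so the
    warping path stays near the diagonal.
    """
    n, m = len(a), len(b)
    inf = math.inf
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if band is not None and abs(i - j) > band:
                continue  # cell lies outside the allowed band
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Usage: the second sequence repeats its first frame, so warping absorbs it.
x = [[0.0], [1.0], [2.0]]
y = [[0.0], [0.0], [1.0], [2.0]]
print(dtw_distance(x, y))
```

Whole-sequence DTW of this kind forces the two utterances to align end to end; finding matching subsequences, as the abstract describes, requires relaxing that constraint.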
