Towards unsupervised pattern discovery in speech

Abstract

We present an unsupervised algorithm for discovering acoustic patterns in speech by finding matching subsequences between pairs of utterances. The approach we describe is, in theory, language- and topic-independent, and is particularly well suited to processing large amounts of speech from a single speaker. A variation of dynamic time warping (DTW), which we call segmental DTW, is used to perform the pairwise utterance comparison. Using academic lecture data, we describe two potentially useful applications of the segmental DTW output: augmenting speech recognition transcriptions for information retrieval, and clustering speech segments for unsupervised word discovery. Some preliminary qualitative results for both experiments are shown, and the implications for future work and applications are discussed.
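
As background for the pairwise comparison mentioned above, here is a minimal sketch of classic DTW between two sequences of feature vectors. It is not the segmental variant, whose details this abstract does not give. The function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional frames are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW alignment cost between two sequences of feature vectors.

    Illustrative only: plain whole-sequence DTW with a Euclidean frame
    distance (an assumption), not the segmental DTW of the paper.
    """
    n, m = len(a), len(b)
    # cost[i][j] is the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = frame_dist + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


# Toy example: the second sequence repeats one frame, so warping absorbs it.
utterance_a = [(0.0,), (1.0,), (2.0,)]
utterance_b = [(0.0,), (1.0,), (1.0,), (2.0,)]
alignment_cost = dtw_distance(utterance_a, utterance_b)
```

Classic DTW of this kind aligns two sequences end to end. Finding matching subsequences, as the abstract describes, requires alignments that need not span whole utterances, which is presumably what the segmental variant addresses.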
