A NOVEL DTW-BASED DISTANCE MEASURE FOR SPEAKER SEGMENTATION

1 January 2006

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE) in 2006 IEEE Spoken Language Technology Workshop

p. 22-25
https://doi.org/10.1109/slt.2006.326807

Abstract

We present a novel distance measure for comparing two speech segments that uses a local version of the well-known DTW algorithm. Our approach is based on the idea of finding word-level speech patterns that are repeated by the same speaker. Using this distance measure, we develop a speaker segmentation procedure and apply it to the task of segmenting multi-speaker lectures. We demonstrate that our approach is able to generate segmentations that correlate well with independently generated human segmentations. In experiments performed on over ten hours of multi-speaker lecture data, we were able to find speaker change points with precision and recall rates of 80% and 100%, respectively.
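
For readers unfamiliar with DTW (dynamic time warping), the following is a minimal sketch of the classic global algorithm on which such distance measures build. It is not the local variant proposed in the paper, whose details the abstract does not give. The function name dtw_distance, the Euclidean frame distance, and the representation of each segment as a list of feature vectors are assumptions made for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic global DTW cost between two sequences of feature vectors.

    Illustrative only: this is the textbook algorithm, not the paper's
    local DTW variant. Frames are compared with Euclidean distance,
    which is an assumption of this sketch.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


if __name__ == "__main__":
    x = [(0.0,), (1.0,), (2.0,)]
    y = [(0.0,), (1.0,), (1.0,), (2.0,)]
    print(dtw_distance(x, y))
```

In this toy example the second sequence is a time-stretched copy of the first, so the warping path absorbs the repeated frame. A local variant would instead search for well-matching subsequences rather than forcing an end-to-end alignment.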
