A NOVEL DTW-BASED DISTANCE MEASURE FOR SPEAKER SEGMENTATION

Abstract
We present a novel distance measure for comparing two speech segments that uses a local version of the well-known dynamic time warping (DTW) algorithm. Our approach is based on the idea of finding word-level speech patterns that are repeated by the same speaker. Using this distance measure, we develop a speaker segmentation procedure and apply it to the task of segmenting multi-speaker lectures. We demonstrate that our approach generates segmentations that correlate well with independently generated human segmentations. In experiments on over ten hours of multi-speaker lecture data, we found speaker change points with a precision of 80% and a recall of 100%.
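
For readers unfamiliar with DTW, the following is a minimal sketch of the classic global DTW distance between two feature sequences. It is not the local variant proposed in this paper, whose details the abstract does not give. The function name `dtw_distance`, the Euclidean frame distance, and the three-way step pattern are assumptions chosen for illustration only.

```python
import numpy as np


def dtw_distance(a, b):
    """Classic global DTW cost between two sequences of feature frames.

    Illustrative only: the paper uses a *local* DTW variant that is not
    specified in the abstract. The Euclidean frame distance is an assumption.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Treat 1-D input as a sequence of one-dimensional frames.
    if a.ndim == 1:
        a = a[:, None]
    if b.ndim == 1:
        b = b[:, None]

    n, m = len(a), len(b)
    # cost[i, j] is the cheapest alignment of a[:i] with b[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            frame_dist = np.linalg.norm(a[i - 1] - b[j - 1])
            # Extend the best of: insertion, deletion, or match.
            cost[i, j] = frame_dist + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )

    return float(cost[n, m])
```

In this sketch, two sequences that differ only in timing, such as `[1, 2, 3]` and `[1, 2, 2, 3]`, align with zero cost. That is the property that makes DTW suitable for matching the same spoken pattern uttered at different speeds.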
