Tandem decoding of children's speech for keyword detection in a child-robot interaction scenario
- 18 August 2011
- journal article
- research article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Speech and Language Processing
- Vol. 7 (4), 1-22
- https://doi.org/10.1145/1998384.1998386
Abstract
In this article, we focus on keyword detection in children's speech, as needed in voice command systems. We use the FAU Aibo Emotion Corpus, which contains emotionally colored spontaneous children's speech recorded in a child-robot interaction scenario, and investigate various recent keyword spotting techniques. As the principle of bidirectional Long Short-Term Memory (BLSTM) is known to be well-suited for context-sensitive phoneme prediction, we incorporate a BLSTM network into a Tandem model for flexible coarticulation modeling in children's speech. Our experiments reveal that the Tandem model prevails over a triphone-based Hidden Markov Model approach.
Funding Information
- Seventh Framework Programme (211486 (SEMAINE), IST-2002-50742 (HUMAINE), IST-2001-37599 (PF-STAR))
This publication has 30 references indexed in Scilit:
- Being bored? Recognising natural interest by extensive audiovisual integration for real-life application. Image and Vision Computing, 2009
- Learning long-term dependencies with recurrent neural networks. Neurocomputing, 2008
- Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 2005
- Creating conversational interfaces for children. IEEE Transactions on Speech and Audio Processing, 2002
- An overview of audio information retrieval. Multimedia Systems, 1999
- Long Short-Term Memory. Neural Computation, 1997
- Learning long-term dependencies in NARX recurrent neural networks. IEEE Transactions on Neural Networks, 1996
- Learning Complex, Extended Sequences Using the Principle of History Compression. Neural Computation, 1992
- A time-delay neural network architecture for isolated word recognition. Neural Networks, 1990
- Some observations on the development of anticipatory coarticulation. The Journal of the Acoustical Society of America, 1986