The Impact of Temporally Coherent Visual Cues on Speech Perception in Complex Auditory Environments
- Open Access
- Published: 7 June 2021
- Research article
- Published by Frontiers Media SA in Frontiers in Neuroscience
Abstract
Speech perception often takes place in noisy environments, where multiple auditory signals compete with one another. Adding visual cues such as a talker's face or lip movements to an auditory signal can improve the intelligibility of speech in such suboptimal listening environments; this is referred to as the audiovisual benefit. The current study aimed to delineate the signal-to-noise ratio (SNR) conditions under which visual presentation of the acoustic amplitude envelope has its most significant impact on speech perception. Seventeen adults with normal hearing were recruited. Participants were presented with spoken sentences in babble noise, in either auditory-only or auditory-visual conditions, at SNRs of −7, −5, −3, −1, and 1 dB. The visual stimulus was a sphere whose size varied in synchrony with the amplitude envelope of the target speech signal. Participants were asked to transcribe the sentences they heard. Accuracy improved significantly in the auditory-visual condition relative to the auditory-only condition at SNRs of −3 and −1 dB, but not at the other SNRs. These results show that dynamic temporal visual information can benefit speech perception in noise, and that the facilitative effect of the visual amplitude envelope is strongest within an intermediate SNR range.
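The paper does not publish its stimulus-generation code, but the abstract describes two standard signal-processing steps: extracting the amplitude envelope that drove the sphere's size, and mixing speech with babble noise at a fixed target SNR. A minimal sketch of those two steps, assuming NumPy/SciPy and illustrative function names, might look like this:

```python
import numpy as np
from scipy.signal import hilbert

def amplitude_envelope(signal):
    """Instantaneous amplitude envelope via the Hilbert transform.

    The modulus of the analytic signal traces the slow amplitude
    variations that the visual sphere could be scaled by.
    """
    return np.abs(hilbert(signal))

def mix_at_snr(speech, babble, snr_db):
    """Scale babble noise so the speech-to-noise power ratio equals snr_db.

    Example targets from the study: -7, -5, -3, -1, and 1 dB.
    """
    babble = babble[: len(speech)]
    p_speech = np.mean(speech ** 2)
    p_babble = np.mean(babble ** 2)
    # Solve p_speech / (gain**2 * p_babble) = 10**(snr_db / 10) for gain.
    gain = np.sqrt(p_speech / (p_babble * 10 ** (snr_db / 10)))
    return speech + gain * babble
```

In a stimulus pipeline of this kind, the envelope would typically be low-pass filtered and resampled to the video frame rate before mapping it to the sphere's radius; the details here are an assumption, not the authors' implementation.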