Speech perception at the interface of neurobiology and linguistics
- 21 September 2007
- journal article
- research article
- Published by The Royal Society in Philosophical Transactions B
- Vol. 363 (1493), 1071-1086
- https://doi.org/10.1098/rstb.2007.2160
Abstract
Speech perception consists of a set of computations that take continuously varying acoustic waveforms as input and generate discrete representations that make contact with the lexical representations stored in long-term memory as output. Because the perceptual objects that are recognized by the speech perception system enter into subsequent linguistic computation, the format that is used for lexical representation and processing fundamentally constrains the speech perceptual processes. Consequently, theories of speech perception must, at some level, be tightly linked to theories of lexical representation. Minimally, speech perception must yield representations that smoothly and rapidly interface with stored lexical items. Adopting the perspective of Marr, we argue and provide neurobiological and psychophysical evidence for the following research programme. First, at the implementational level, speech perception is a multi-time resolution process, with perceptual analyses occurring concurrently on at least two time scales (approx. 20–80 ms, approx. 150–300 ms), commensurate with (sub)segmental and syllabic analyses, respectively. Second, at the algorithmic level, we suggest that perception proceeds on the basis of internal forward models, or uses an ‘analysis-by-synthesis’ approach. Third, at the computational level (in the sense of Marr), the theory of lexical representation that we adopt is principally informed by phonological research and assumes that words are represented in the mental lexicon in terms of sequences of discrete segments composed of distinctive features. One important goal of the research programme is to develop linking hypotheses between putative neurobiological primitives (e.g. temporal primitives) and those primitives derived from linguistic inquiry, to arrive ultimately at a biologically sensible and theoretically satisfying model of representation and computation in speech.
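The multi-time resolution claim can be made concrete with a toy sketch: the same signal is analysed concurrently in short (~25 ms) and long (~200 ms) sliding windows, corresponding roughly to the (sub)segmental and syllabic scales the abstract mentions. This is purely illustrative (the windowing function, window lengths, and the synthetic signal are assumptions, not the authors' model):

```python
# Illustrative sketch, not the authors' model: one signal analysed
# concurrently on two time scales (segmental vs. syllabic).
import numpy as np

def windowed_rms(signal, sr, window_s, hop_s):
    """RMS energy of `signal` over sliding windows of `window_s` seconds."""
    win = int(window_s * sr)
    hop = int(hop_s * sr)
    return np.array([
        np.sqrt(np.mean(signal[i:i + win] ** 2))
        for i in range(0, len(signal) - win + 1, hop)
    ])

sr = 16000                            # sample rate (Hz)
t = np.arange(0, 1.0, 1 / sr)         # 1 s of signal
# Toy "speech-like" signal: a fast carrier modulated by a slow,
# syllable-rate (4 Hz) envelope.
signal = np.sin(2 * np.pi * 4 * t) * np.sin(2 * np.pi * 200 * t)

# Two concurrent analysis streams over the same input:
fast = windowed_rms(signal, sr, window_s=0.025, hop_s=0.010)  # ~25 ms scale
slow = windowed_rms(signal, sr, window_s=0.200, hop_s=0.050)  # ~200 ms scale

print(len(fast), len(slow))  # the slow stream yields far fewer frames
```

The fast stream tracks rapid energy changes (the scale of segmental cues), while the slow stream smooths over them and follows the syllable-rate envelope; both are computed from the identical waveform, which is the sense in which the analyses run "concurrently".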