Toward a Model for Speech Recognition

Abstract
An approach to the design of a machine for the recognition and synthesis of speech is proposed, with particular emphasis on problems of acoustical analysis. As a recognizer, the proposed machine accepts a speechwave at its input and generates a sequence of phonetic symbols at its output; as a synthesizer, it accepts a sequence of symbols at its input and generates a speechwave. Coupling between the acoustical speech signal and the machine is achieved through two peripheral units: one an analog filter set or equivalent, and the other a model of the vocal tract. Between the analog filters and the phonetic output, the signal passes through an intermediate representation that is related to vocal‐tract configurations and excitations but is not necessarily described specifically in these terms. Each stage of analysis is performed by synthesizing a number of alternative signals or patterns according to rules stored within the machine and comparing the synthesized patterns with the input signals under analysis. Possible advantages of the proposed method of analysis are discussed. An experimental study based on the general analysis approach is described in an Appendix. In this study, a method for determining the frequencies of vocal‐tract resonances from the speechwave is simulated on a digital computer.
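
The synthesis-and-comparison procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the method of the paper or its Appendix: the two-resonance spectrum model, the fixed 100 Hz bandwidth, the candidate grids, the squared-error measure, and all function names are assumptions chosen only to show the general idea of generating trial patterns by rule and keeping the one that best matches the input.

```python
import math


def synthesize_spectrum(formants, freqs, bandwidth=100.0):
    """Log-magnitude spectrum (dB) of cascaded second-order resonances.

    Assumed toy model: each resonance at fc contributes
    |H|^2 = 1 / ((1 - (f/fc)^2)^2 + (bandwidth * f / fc^2)^2).
    """
    spectrum = []
    for f in freqs:
        level_db = 0.0
        for fc in formants:
            r = f / fc
            denom = (1.0 - r * r) ** 2 + (bandwidth * f / (fc * fc)) ** 2
            level_db -= 10.0 * math.log10(denom)
        spectrum.append(level_db)
    return spectrum


def analyze_by_synthesis(observed, freqs, f1_grid, f2_grid):
    """Try every candidate (F1, F2) pair and keep the closest match."""
    best, best_err = None, math.inf
    for f1 in f1_grid:
        for f2 in f2_grid:
            if f2 <= f1:
                continue  # keep the resonances in ascending order
            trial = synthesize_spectrum((f1, f2), freqs)
            # Comparison step: summed squared difference in dB.
            err = sum((a - b) ** 2 for a, b in zip(observed, trial))
            if err < best_err:
                best, best_err = (f1, f2), err
    return best, best_err


# Hypothetical input: a spectrum generated by the same toy model,
# sampled every 100 Hz (a stand-in for the filter-set output).
FREQS = list(range(100, 3100, 100))
observed = synthesize_spectrum((500, 1500), FREQS)
estimate, error = analyze_by_synthesis(
    observed, FREQS, range(200, 1000, 50), range(800, 2600, 100)
)
print(estimate, error)
```

An exhaustive grid search keeps the sketch short; a practical system would presumably narrow the set of alternatives using stored rules or results from preceding analysis intervals rather than testing every candidate.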