Measuring spoken language: a unit for all reasons

Abstract
The analysis of spoken language requires a principled way of dividing transcribed data into units in order to assess features such as accuracy and complexity. If such analyses are to be comparable across different studies, there must be agreement on the nature of the unit, and it must be possible to apply this unit reliably to a range of different types of speech data. There are a number of different units in use, the various merits of which have been discussed by Crookes (1990). However, while these have been used to facilitate the analysis of spoken language data, there is presently no comprehensive, accessible definition of any of them, nor are detailed guides available on how to identify such units in data sets. Research reports tend to provide simplistic two-line definitions of units exemplified, if at all, by unproblematic written examples. These are inadequate when applied to transcriptions of complex oral data, which tend not to lend themselves easily to a clear division into units. This paper was motivated by the need each of the three authors felt for a reliable and comprehensively defined unit to assist with the analysis of a variety of recordings of native and non-native speakers of English. We first discuss in very general terms the criteria according to which such a unit might be selected. Next, we examine the main categories of unit which have been adopted previously and provide a justification for the particular type of unit that we have chosen. Focusing on this unit, we identify a number of problems which are associated with the definition and exemplification of units of this type, and give examples of the awkward cases found in actual data. Finally, we offer a definition of our unit, the Analysis of Speech Unit (AS-unit), providing adequate detail to address the problematic cases we have illustrated.