Validity and threats to the validity of vocabulary measurement

Abstract
Introduction

In the first chapter of this volume, Paul Nation draws our attention to six factors that are potential threats to the validity of language assessment. By way of introduction to this chapter, we will briefly consider three of these before concentrating in more detail on issues connected with ‘multiple measures’ and ‘the unit of counting’. We will omit ‘testing vocabulary in use’, though not because we think it unimportant. More naturalistic approaches to language testing, which include aspects of normal language use, are advocated in many areas of applied linguistics (see Porter, 1997, on second language oral testing; Wells, 1985, on early first language assessment; Holmes and Singh, 1996, on vocabulary in aphasia; Bucks, Singh, Cuerden and Wilcock, 2000, on analysing lexical performance in dementia patients). Issues that touch on this will be addressed under ‘multiple measures’.

All three remaining factors discussed by Nation – frequency lists, learner attitude, and first and second language formats – strike a chord with our own concerns related to research, language teaching, and the education of teachers. In the past, we have been sceptical about the use of word-frequency lists in second language assessment (Malvern, Richards, Chipere and Durán, 2004). At lower and even intermediate levels, learners' exposure to vocabulary can be too limited, and too heavily biased towards the lexical idiosyncrasies of teachers and textbooks, for native speaker norms to be a reliable metric. Recently, however, we have been forced to think again.