The Spoken BNC2014
Open Access
- 5 July 2022
- journal article
- Published by John Benjamins Publishing Company in Corpus Studies of Language Through Time
- Vol. 22 (3), 319-344
- https://doi.org/10.1075/ijcl.22.3.02lov
Abstract
This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.