The Spoken BNC2014

Open Access

5 July 2022

journal article
Published by John Benjamins Publishing Company in International Journal of Corpus Linguistics

Vol. 22 (3), 319-344
https://doi.org/10.1075/ijcl.22.3.02lov

Abstract

This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.

