
The Spoken BNC2014
Published: 5 July 2022
International Journal of Corpus Linguistics, Volume 22, pp 319-344; https://doi.org/10.1075/ijcl.22.3.02lov
Abstract: This paper introduces the Spoken British National Corpus 2014, an 11.5-million-word corpus of orthographically transcribed conversations among L1 speakers of British English from across the UK, recorded in the years 2012–2016. After showing that a survey of the recent history of corpora of spoken British English justifies the compilation of this new corpus, we describe the main stages of the Spoken BNC2014’s creation: design, data and metadata collection, transcription, XML encoding, and annotation. In doing so we aim to (i) encourage users of the corpus to approach the data with sensitivity to the many methodological issues we identified and attempted to overcome while compiling the Spoken BNC2014, and (ii) inform (future) compilers of spoken corpora of the innovations we implemented to attempt to make the construction of corpora representing spontaneous speech in informal contexts more tractable, both logistically and practically, than in the past.
Keywords: Spoken BNC2014 / British / corpus / attempted / English / survey / corpora / compilation