The Written British National Corpus 2014 – design and comparability
Open Access
- 12 August 2021
- journal article
- research article
- Published by Walter de Gruyter GmbH in Text & Talk - An Interdisciplinary Journal of Language, Discourse & Communication Studies
- Vol. 41 (5-6), 595-615
- https://doi.org/10.1515/text-2020-0052
Abstract
The British National Corpus 2014 is a major project led by Lancaster University to create a 100-million-word corpus of present-day British English. This corpus has been constructed as a comparable counterpart of the original British National Corpus (referred to as the BNC1994 in this article), which was compiled in the early 1990s. This article starts with the justification of the project, answering the question of ‘Why do we need a new BNC?’. We then provide a general overview of the construction of the Written British National Corpus 2014 (Written BNC2014); we also briefly discuss some issues of data collection before looking in detail at the design of the corpus. Compiling a large general corpus such as the Written BNC2014 has been a major undertaking involving teamwork and collaboration. It also required generosity on the part of the many individuals and organisations who contributed to the data collection.