The Written British National Corpus 2014 – design and comparability

Abstract
The British National Corpus 2014 is a major project led by Lancaster University to create a 100-million-word corpus of present-day British English. The corpus has been constructed as a comparable counterpart of the original British National Corpus (referred to as the BNC1994 in this article), which was compiled in the early 1990s. This article starts with the justification for the project, answering the question ‘Why do we need a new BNC?’. We then provide a general overview of the construction of the Written British National Corpus 2014 (Written BNC2014) and briefly discuss some issues of data collection before looking in detail at the design of the corpus. Compiling a large general corpus such as the Written BNC2014 has been a major undertaking involving teamwork and collaboration. It also required generosity on the part of the many individuals and organisations who contributed to the data collection.
