The Trinity Lancaster Corpus

Open Access

5 July 2022

journal article
Published by John Benjamins Publishing Company in International Journal of Learner Corpus Research

Vol. 5 (2), 126-158
https://doi.org/10.1075/ijlcr.19001.gab

Abstract

This paper introduces a new corpus resource for language learning research, the Trinity Lancaster Corpus (TLC), which contains 4.2 million words of interaction between L1 and L2 speakers of English. The corpus includes spoken production from over 2,000 L2 speakers from different linguistic and cultural backgrounds at different levels of proficiency engaged in two to four tasks. The paper provides a description of the TLC and places it in the context of current learner corpus development and research. The discussion of practical decisions taken in the construction of the TLC also enables a critical reflection on current methodological issues in corpus construction.
