DART – The dialogue annotation and research tool

1 January 2016

journal article
Published by Walter de Gruyter GmbH in Corpus Linguistics and Linguistic Theory

Vol. 12 (2)
https://doi.org/10.1515/cllt-2014-0051

Abstract

Corpus-based research into pragmatics suffers from a distinct lack of suitably annotated corpora. This has so far generally forced researchers in corpus-based pragmatics to focus on well-known fixed expressions (e.g. discourse markers and politeness formulae) rather than investigating interaction at the level of speech acts and other pragmatics-relevant features on a larger scale. This article describes a research environment that aims to remedy this problem (currently for English only) by making large-scale annotation of, and research into, speech acts and other linguistic levels possible in an efficient manner, while also discussing the difficulties and complexities inherent in such an endeavour. It then illustrates, in a brief case study, the efficiency of the approach and how the resulting annotations represent an improvement over existing models. The case study includes an illustrative discussion of the tool's performance in annotating a subset of 100 files from the Switchboard corpus, plus a more detailed comparison of the automatically annotated version of one file with its original, manually annotated version.
