Towards Automatic Annotation of Anaphoric Links in Corpora

31 December 1999

journal article
Published by John Benjamins Publishing Company in Corpus Studies of Language Through Time

Vol. 4 (2), 261-280
https://doi.org/10.1075/ijcl.4.2.04mit

Abstract

The paper proposes a methodology for the semi-automatic annotation of pronoun-antecedent pairs in corpora. The proposal is based on robust, knowledge-poor pronoun resolution followed by post-editing. The paper is structured as follows. The introduction observes that the automatic identification of referential links in corpora has lagged behind similar lexical, syntactic, and even semantic tasks. Section 2 outlines the author's robust, knowledge-poor approach to pronoun resolution, which is subsequently put forward as the core of a larger architecture for the automatic tagging of referential links. Section 3 briefly presents other related knowledge-poor approaches, while Section 4 discusses the limitations and advantages of the knowledge-poor approach outlined in Section 2. The main argument of the paper is in Section 5, which presents the idea of developing a semi-automatic environment for annotating anaphoric links and outlines the components of such a program. Finally, the conclusion considers the anticipated success rate of the approach.
