Strategies in tracing linguistic variation in a corpus of Old Irish texts (CorPH)

Open Access

20 September 2022

journal article
research article
Published by John Benjamins Publishing Company in Corpus Studies of Language Through Time

Vol. 27 (4), 529-553
https://doi.org/10.1075/ijcl.22018.sti

Abstract

This article introduces Corpus PalaeoHibernicum (CorPH), a corpus currently consisting of 78 texts in Early Irish (c. 7th–10th cent.). The ERC-funded Chronologicon Hibernicum (ChronHib) project created it by bringing together pre-existing lexical and syntactic databases and adding further crucial texts from the period. In addition to annotation for POS, morphological and syntactic information, CorPH carries a further layer of annotation, ‘Variation Tagging’: a tagset that numerically encodes synchronic language variation during the Early Irish period and thus allows for much improved research on chronological variation in the material. A second new pillar for studying linguistic variation is Bayesian Language Variation Analysis (BLaVA), which addresses the challenge that “not-so-big data” poses to statistical corpus methods. Instead of reflecting feature frequencies, BLaVA models language variation as probabilities of variation.
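The contrast between feature frequencies and probabilities of variation can be illustrated with a minimal sketch. It is not the BLaVA model itself, whose details the abstract does not give; it assumes a plain Beta-Binomial update with a uniform prior, and the names `BetaPosterior` and `update`, the variant labels, and all counts are hypothetical. Two texts show the same raw frequency of an innovative variant (2 of 3 tokens versus 200 of 300), yet the posterior for the short text is far more uncertain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta distribution over the probability of the innovative variant."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    @property
    def variance(self) -> float:
        total = self.alpha + self.beta
        return (self.alpha * self.beta) / (total ** 2 * (total + 1))


def update(innovative: int, conservative: int,
           prior_alpha: float = 1.0, prior_beta: float = 1.0) -> BetaPosterior:
    """Conjugate update: add observed counts to the prior pseudo-counts.

    The uniform Beta(1, 1) prior is an assumption made for illustration only.
    """
    if innovative < 0 or conservative < 0:
        raise ValueError("counts must be non-negative")
    return BetaPosterior(prior_alpha + innovative, prior_beta + conservative)


# Hypothetical counts: both texts have a raw frequency of 2/3.
short_text = update(innovative=2, conservative=1)
long_text = update(innovative=200, conservative=100)

for label, post in (("short", short_text), ("long", long_text)):
    print(f"{label}: mean={post.mean:.3f} variance={post.variance:.6f}")
```

In a sketch like this, the short text's estimate is pulled toward the prior and keeps a wide spread, which is the kind of behaviour that makes a probabilistic treatment attractive when a corpus offers only a handful of tokens per text.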
