Type-token models: a comparative study

28 October 2014

journal article
research article
Published by Taylor & Francis Ltd in Journal of Quantitative Linguistics

Vol. 22 (1), 1-21
https://doi.org/10.1080/09296174.2014.974456

Abstract

The type (V) – token (N) relationship has been studied for almost a century. Although a number of models have been developed to examine this relationship, comparative studies have been rare. Thirty-six published and 14 new models were examined using 28 Latin texts. The results were similar to those found in other languages. The best-fitting models were of the form 1/V = a + b/N^c, where a, b and c are constants. The ratio V/N was also found to be well fitted by the relation V/N = a + b log N.
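To make the two relations in the abstract concrete, the following is a minimal sketch of how they could be fitted to a type-token curve. It is not the authors' procedure: the function names, the grid search over c (assumed range 0.05 to 2.00) and the choice to minimise squared error on the 1/V scale are all assumptions made for illustration, and the natural logarithm is assumed for log N.

```python
import math


def type_token_curve(tokens):
    """Return (N, V) pairs: V distinct types seen after the first N tokens."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, sse)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return a, b, sse


def fit_inverse_power(ns, vs, c_grid=None):
    """Fit 1/V = a + b/N**c (hypothetical helper).

    For each candidate c the model is linear in a and b, so c is scanned
    over a grid and a, b are solved by least squares on the 1/V scale.
    Returns (a, b, c, sse) for the best c.
    """
    if c_grid is None:
        c_grid = [k / 100 for k in range(5, 201)]  # assumed search range
    ys = [1.0 / v for v in vs]
    best = None
    for c in c_grid:
        xs = [n ** -c for n in ns]
        a, b, sse = linear_fit(xs, ys)
        if best is None or sse < best[3]:
            best = (a, b, c, sse)
    return best


def fit_ttr_log(ns, vs):
    """Fit V/N = a + b*log(N) (natural log assumed); returns (a, b, sse)."""
    xs = [math.log(n) for n in ns]
    ys = [v / n for n, v in zip(ns, vs)]
    return linear_fit(xs, ys)
```

A typical use would be to build the curve from a tokenised text with `type_token_curve`, split it into the N and V sequences, and pass those to `fit_inverse_power` and `fit_ttr_log`. Comparing models fairly would need a common error scale or an information criterion, which this sketch does not attempt.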

