Word Length Distribution in Mongolian
- 21 February 2014
- journal article
- research article
- Published by Taylor & Francis Ltd in Journal of Quantitative Linguistics
- Vol. 21 (2), 123-152
- https://doi.org/10.1080/09296174.2014.882191
Abstract
This paper examines the distribution of word length and stem length in Mongolian, using both dynamic (a corpus of 1 million Mongolian word tokens) and static (an orthographic Mongolian dictionary and a Mongolian stem dictionary) language resources. The results show that Mongolian word and stem lengths follow distributions of the Poisson family. Specifically, word length in the dynamic corpus follows the Dacey-Poisson distribution, while all the others follow the Conway-Maxwell-Poisson distribution. In addition, Mongolian word lengths are influenced by word frequencies, broadly conforming to Zipf’s Principle of Least Effort. Experiments fitting a power-function relationship between Mongolian word length and word frequency on individual short texts, continuous long texts, and fixed-length texts indicate that individual texts of fixed length (about 2000 words) yield the best fit.
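For readers unfamiliar with the Conway-Maxwell-Poisson distribution named in the abstract, the following is a minimal sketch of its standard probability mass function, P(X = x) = λ^x / (x!)^ν / Z(λ, ν), where Z is the normalizing sum over all x ≥ 0. It is not taken from the paper: the function names, the truncation of the normalizing sum at `max_terms`, and the 1-displaced variant (shifting the support so that lengths start at 1) are assumptions made for illustration, and the abstract does not state which parameterization or displacement the authors used.

```python
import math


def cmp_pmf(x, lam, nu, max_terms=200):
    """Conway-Maxwell-Poisson probability P(X = x) for x = 0, 1, 2, ...

    Requires lam > 0 and nu > 0. With nu = 1 this reduces to the
    ordinary Poisson distribution with mean lam.
    """
    if x < 0:
        return 0.0

    # Work in log space to avoid overflow:
    # log(lam**j / (j!)**nu) = j*log(lam) - nu*log(j!)
    def log_term(j):
        return j * math.log(lam) - nu * math.lgamma(j + 1)

    # Normalizing constant Z(lam, nu), truncated at max_terms.
    # The truncation point is an assumption of this sketch; it is
    # adequate for moderate lam and nu not too close to 0.
    log_terms = [log_term(j) for j in range(max_terms)]
    peak = max(log_terms)
    log_z = peak + math.log(sum(math.exp(t - peak) for t in log_terms))

    return math.exp(log_term(x) - log_z)


def displaced_cmp_pmf(length, lam, nu):
    """Hypothetical 1-displaced variant: support starts at length 1."""
    return cmp_pmf(length - 1, lam, nu)
```

The extra parameter ν controls dispersion: values above 1 give a distribution narrower than the Poisson, and values below 1 give a wider one, which is what makes the family flexible enough to fit different length distributions.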