Can the Probability Distribution of Dependency Distance Measure Language Proficiency of Second Language Learners?

26 October 2017

journal article
research article
Published by Taylor & Francis Ltd in Journal of Quantitative Linguistics

Vol. 25 (4), 1-19
https://doi.org/10.1080/09296174.2017.1373991

Abstract

The length distributions of many linguistic units have been found to fit the same model, the Zipf-Alekseev function. In this article, we investigate whether this also holds for English learners’ interlanguage and whether the parameters of the probability distribution of dependency distance can measure the language proficiency of second language learners. We selected 367 English learners from nine consecutive grades and fitted different probability distribution models to the dependency distances in their English writing and in self-built contrastive dependency treebanks based on the Wall Street Journal Corpus. We found that: (1) the Zipf-Alekseev distribution captures well the probability distribution of dependency distance for each grade and for native speakers; (2) the probability distribution of dependency distance measures second language learners’ language proficiency well at different learning stages; (3) high-level learners do not show exactly the same parameters in the probability distribution of dependency distance as native speakers, which means that learners’ language proficiency is not as high as that of native English speakers and that second language learners’ syntactic acquisition process is always constrained by the tendency toward dependency distance minimization. This study corroborates that quantitative linguistic methods can be usefully applied in second language acquisition research.

Funding Information

National Social Science Foundation of China (17AYY021)

