Using a Chinese treebank to measure dependency distance

Abstract
This article describes a method for calculating the ‘dependency distance’ of the words in a text, i.e. the number of words that separate each word from the word on which it depends syntactically, and reports the results of applying this method to a Chinese treebank. The study shows that Chinese dependencies tend strongly to be governor-final, and that the mean dependency distance is much higher for Chinese than for the other languages studied so far, including English, German and Japanese. It remains unclear whether this difference means that Chinese is syntactically more difficult to process.
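
As an illustration of the measure, the following is a minimal sketch and not the procedure used in the study. It assumes that each sentence is given as a list of 1-based governor positions, with None marking the root, and that distance is the governor's position minus the dependent's position, so that adjacent words have a distance of 1 and a positive value marks a governor-final dependency. If distance is instead counted as the number of intervening words, subtract 1 from each absolute value. The function names and the toy English sentence are hypothetical.

```python
from typing import List, Optional


def dependency_distances(heads: List[Optional[int]]) -> List[int]:
    """Return signed distances (governor position minus dependent position).

    heads[i] is the 1-based position of the governor of word i + 1,
    or None for the root, which has no governor and is skipped.
    Sign convention (an assumption): positive = governor follows dependent.
    """
    distances = []
    for position, head in enumerate(heads, start=1):
        if head is None:
            continue
        distances.append(head - position)
    return distances


def mean_dependency_distance(heads: List[Optional[int]]) -> float:
    """Mean of the absolute distances over all non-root words."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(abs(d) for d in distances) / len(distances)


def governor_final_share(heads: List[Optional[int]]) -> float:
    """Proportion of dependencies in which the governor follows its dependent."""
    distances = dependency_distances(heads)
    if not distances:
        return 0.0
    return sum(1 for d in distances if d > 0) / len(distances)


# Toy sentence (hypothetical analysis): "The dog chased the cat"
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
toy_heads = [2, 3, None, 5, 3]

print(dependency_distances(toy_heads))      # [1, 1, 1, -2]
print(mean_dependency_distance(toy_heads))  # 1.25
print(governor_final_share(toy_heads))      # 0.75
```

For a whole treebank, the same functions could be applied sentence by sentence and the distances pooled before averaging. Whether to pool over words or to average per-sentence means is a design choice that this sketch leaves open.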