Probability Distributions of Syntactic Units and Properties

Abstract
In Köhler (1999), an attempt was made to set up a basic functional-analytic model of a syntactic subsystem in the framework of synergetic linguistics. In that paper, functional dependencies among selected properties, viz. frequency, complexity, length, depth of embedding, and information, and the quantities polyfunctionality and synfunctionality were postulated, derived, and empirically tested on data from the Susanne corpus (Sampson, 1995). The analysis of the probability distributions of these quantities was postponed there and is taken up in the present study. It is shown that the properties of syntactic constructions are lawfully distributed according to only a few distributions, all of which belong to a common family of probability distributions, and that hypotheses can be set up from which the corresponding distributions can be derived, thus explaining the empirical findings. The empirical database is extended to a second language by adding data from the German Negra-Korpus (Brants, 1999, p. 102). The empirical tests yield results which are compatible with the hypotheses. Syntactic constructions and categories are taken as the basic units. For the Susanne corpus, the clause, phrase, and word class tags were evaluated as operationalizations of these units; for the Negra-Korpus, all node tags and word tags were used.