Group-size dependence: a rationale for choice between numerical classifications

Abstract
The problem of erecting criteria for choice between numerical classifications is briefly surveyed. It is concluded that the most difficult case arises in the synoptic classification of highly heterogeneous data, for which a powerful clustering system is essential and for which several alternative strategies are in common use. In all such strategies an inter-group or individual-group measure depends on the size of the group, but the nature of this dependence has not previously been investigated. It is investigated here for four widely used strategies, and conclusions are drawn as to their differing applicability to particular types of problem.