Independence, Measurement Complexity, and Classification Performance

Abstract
If f(x) and g(x) are the densities of the N-dimensional measurement vector x, conditioned on the classes c₁ and c₂, and if finite sets of samples from the two classes are available, then a decision function based on the estimates f̂(x) and ĝ(x) can be used to classify future observations. In general, however, when the measurement complexity (the dimensionality N) is increased arbitrarily while the sets of training samples remain finite, a "peaking phenomenon" of the following kind is observed: classification accuracy improves at first, peaks at a finite value of N, called the optimum measurement complexity, and deteriorates thereafter. We derive, for the case of statistically independent measurements, general conditions under which it can be guaranteed that the peaking phenomenon will not occur and that the probability of correct classification will keep increasing toward unity as N → ∞. Several applications are considered which together indicate, contrary to general belief, that independence of measurements alone does not guarantee the absence of the peaking phenomenon.
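The peaking phenomenon described above can be illustrated with a small simulation. The sketch below is not the paper's derivation; it assumes a hypothetical setting chosen only for illustration: independent Gaussian measurements with known unit variance, class means +mu_i for c₁ and -mu_i for c₂ with mu_i = 1/i, equal priors, and a plug-in decision rule that uses class means estimated from n_train samples per class. The function names and parameter values are likewise assumptions.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_accuracy(n_dims, n_train, n_trials=300, seed=0):
    """Average accuracy of a plug-in classifier over random training sets.

    Assumed (hypothetical) model: measurement i is N(+mu_i, 1) under c1 and
    N(-mu_i, 1) under c2, independent across i, with mu_i = 1 / i.
    Only the class means are estimated; the unit variance is taken as known.
    """
    rng = random.Random(seed)
    mu = [1.0 / (i + 1) for i in range(n_dims)]
    # The mean of n_train unit-variance samples has std 1 / sqrt(n_train),
    # so the estimated means are drawn directly instead of simulating samples.
    est_std = 1.0 / math.sqrt(n_train)
    total = 0.0
    for _ in range(n_trials):
        m1 = [rng.gauss(m, est_std) for m in mu]    # estimated mean of c1
        m2 = [rng.gauss(-m, est_std) for m in mu]   # estimated mean of c2
        # Plug-in rule: decide c1 when w . x + b > 0.
        w = [a - c for a, c in zip(m1, m2)]
        b = -0.5 * sum(wi * (a + c) for wi, a, c in zip(w, m1, m2))
        norm = math.sqrt(sum(wi * wi for wi in w))
        proj = sum(wi * m for wi, m in zip(w, mu))
        # Exact accuracy of this linear rule under the true densities,
        # averaged over the two equally likely classes.
        acc = 0.5 * (normal_cdf((proj + b) / norm)
                     + normal_cdf((proj - b) / norm))
        total += acc
    return total / n_trials


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 50, 200):
        print(n, round(expected_accuracy(n, n_train=5), 3))
```

In this assumed setting the later measurements carry little discriminating information, while each one adds estimation noise to the decision function, so the average accuracy is expected to rise for small N and then fall as N grows with the training set held fixed. The measurements are independent throughout, which is consistent with the abstract's point that independence alone does not rule out peaking.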
