Pattern recognition preprocessing by similarity functionals

Abstract

A procedure for preprocessing pattern recognition data is presented, and it is shown that designing and using recognizers on such preprocessed pattern data result in increased overall recognition accuracy and reduced storage requirements. The preprocessing procedure operates by estimating probability densities p(c, x) of each pattern class c and pattern vector x, using sample pattern data. Sequential reduction steps are then guided by evaluating a statistical similarity functional S_{c,x}, whose value measures the ability of reduced pattern vectors x'(x) to discriminate between the classes. Since it is usually impossible to successfully estimate the entire joint probability density function of the unreduced pattern vector, the first step is to evaluate the class similarity of each measurement and select a manageable subset. Next, the significant statistical dependencies are accounted for by combining measurements into features and reducing the complexity of these features by a requantization procedure. Experimental results are presented on recognition of hand-printed character data. The utility of these preprocessing methods is illustrated by a reduction of several orders of magnitude in the storage requirements of a pattern recognizer operating on the reduced data, without degrading recognition accuracy.
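The first reduction step described above (scoring each measurement against the class label and keeping a manageable subset) can be sketched as follows. The abstract does not define the similarity functional, so this sketch swaps in empirical mutual information between the class and a single discrete measurement as a stand-in; that choice, the function names `mutual_information` and `select_measurements`, and the toy data are all assumptions made for illustration, not the paper's method.

```python
import math
from collections import Counter


def mutual_information(classes, values):
    """Empirical mutual information I(C; X) in bits.

    ASSUMPTION: used here as a stand-in for the paper's similarity
    functional, which the abstract does not define.
    """
    n = len(classes)
    joint = Counter(zip(classes, values))  # counts of (class, value) pairs
    count_c = Counter(classes)             # marginal counts of classes
    count_x = Counter(values)              # marginal counts of values
    mi = 0.0
    for (c, x), count in joint.items():
        p_cx = count / n
        # p(c, x) / (p(c) * p(x)) written in terms of raw counts
        ratio = count * n / (count_c[c] * count_x[x])
        mi += p_cx * math.log2(ratio)
    return mi


def select_measurements(classes, patterns, k):
    """Score every measurement and return the indices of the k best.

    `patterns` is a list of equal-length tuples of discrete measurements.
    Hypothetical helper: the paper's actual selection rule may differ.
    """
    n_meas = len(patterns[0])
    scores = [
        mutual_information(classes, [p[j] for p in patterns])
        for j in range(n_meas)
    ]
    ranked = sorted(range(n_meas), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Toy sample: measurement 0 determines the class, the others do not.
classes = ["a", "a", "b", "b"]
patterns = [(0, 1, 0), (0, 0, 1), (1, 1, 0), (1, 0, 1)]
selected, scores = select_measurements(classes, patterns, k=1)
print(selected, scores)
```

Scoring measurements one at a time avoids estimating the full joint density of the unreduced pattern vector, which is the motivation the abstract gives for selecting a subset first. It ignores dependencies between measurements; the abstract handles those in a later step by combining measurements into features.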
