Neurocomputational bases of object and face recognition

Abstract
A number of behavioural phenomena distinguish the recognition of faces and objects, even when members of a set of objects are highly similar. Because faces have the same parts in approximately the same relations, individuation of faces typically requires specification of the metric variation in a holistic and integral representation of the facial surface. The direct mapping of a hypercolumn–like pattern of activation onto a representation layer that preserves relative spatial filter values in a two–dimensional (2D) coordinate space, as proposed by C. von der Malsburg and his associates, may account for many of the phenomena associated with face recognition. An additional refinement, in which each column of filters (termed a ‘jet’) is centered on a particular facial feature (or fiducial point), allows selectivity of the input into the holistic representation to avoid incorporation of occluding or nearby surfaces. The initial hypercolumn representation also characterizes the first stage of object perception, but the image variation for objects at a given location in a 2D coordinate space may be too great to yield sufficient predictability directly from the output of spatial kernels. Consequently, objects can be represented by a structural description specifying qualitative (typically, non–accidental) characterizations of an object's parts, the attributes of the parts, and the relations among the parts, largely based on orientation and depth discontinuities (as shown by Hummel and Biederman). A series of experiments on the name priming or physical matching of complementary images (in the Fourier domain) of objects and faces documents that whereas face recognition is strongly dependent on the original spatial filter values, evidence from object recognition indicates strong invariance to these values, even when distinguishing among objects that are as similar as faces.