Unsupervised Learning of Object Features from Video Sequences

Abstract
We develop an efficient algorithm for unsupervised learning of object models, represented as constellations of features, from low-resolution video sequences. The input images typically contain one or more objects that change in pose, scale and degree of occlusion, and the objects can move significantly between consecutive frames. The content of an input sequence is unlabeled, so the learner has to cluster the data based on its implicit coherence over time and space. Our approach takes advantage of the dependent pairwise co-occurrences of an object's features within local neighborhoods, in contrast to the independent behavior of unrelated features. We couple or decouple pairs of features based on a probabilistic interpretation of their pairwise statistics and then extract objects as connected components of coupled features.
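
The coupling and grouping steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the pairwise statistic, so the sketch assumes a simple test that couples two features when their observed co-occurrence count exceeds the count expected under independence by a hypothetical factor `ratio_threshold`. The function name `extract_objects` and the input format (one set of feature ids per observed local neighborhood) are likewise assumptions made for illustration.

```python
from itertools import combinations


def extract_objects(neighborhoods, num_features, ratio_threshold=1.5):
    """Group features into objects from co-occurrence statistics.

    neighborhoods: list of sets; each set holds the ids of the features
        seen together in one local neighborhood of one frame (assumed format).
    ratio_threshold: hypothetical factor by which the observed co-occurrence
        count must exceed the independence baseline to couple a pair.
    """
    n = len(neighborhoods)
    count = [0] * num_features   # how often each feature was observed
    pair_count = {}              # how often each pair was observed together
    for present in neighborhoods:
        for f in present:
            count[f] += 1
        for a, b in combinations(sorted(present), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    # Union-find over features; coupled pairs are merged.
    parent = list(range(num_features))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), observed in pair_count.items():
        # Expected joint count if a and b occurred independently.
        expected = count[a] * count[b] / n
        if observed > ratio_threshold * expected:
            parent[find(a)] = find(b)      # couple the pair

    # Objects are the connected components of coupled features.
    groups = {}
    for f in range(num_features):
        if count[f] > 0:
            groups.setdefault(find(f), []).append(f)
    return sorted(groups.values(), key=lambda g: g[0])


if __name__ == "__main__":
    # Features 0 and 1 move together, as do 2 and 3; feature 4 is unrelated.
    observations = [
        {0, 1, 4}, {2, 3}, {0, 1}, {2, 3, 4}, {0, 1}, {2, 3}, {0, 1, 2, 3},
    ]
    print(extract_objects(observations, num_features=5))
    # prints [[0, 1], [2, 3], [4]]
```

In this toy input, features 0 and 1 co-occur far more often than independence would predict, so they are coupled, while the single joint appearance of all four features is not enough to merge the two groups. A real system would replace the fixed ratio test with whatever probabilistic criterion the paper defines.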
