Finding people by sampling

Abstract
We show how to use a sampling method to find sparsely clad people in static images. People are modeled as an assembly of nine cylindrical segments. Segments are found using an EM algorithm and then assembled into hypotheses incrementally, using a learned likelihood model. Each assembly step passes on a set of samples of its likelihood to the next; this yields effective pruning of the space of hypotheses. The collection of available nine-segment hypotheses is then represented by a set of equivalence classes, which yield an efficient pruning process. The posterior for the number of people is obtained from the class representatives. People are counted quite accurately in images of real scenes using a MAP estimate. We show the method allows top-down as well as bottom up reasoning. While the method can be overwhelmed by very large numbers of segments, we show that this problem can be avoided by quite simple pruning steps.

This publication has 6 references indexed in Scilit: