Abstract
Tracking of humans in videos is important for many applications. A major source of difficulty in performing this task is due to inter-human or scene occlusion. We present an approach based on representing humans as an assembly of four body parts and detection of the body parts in single frames which makes the method insensitive to camera motions. The responses of the body part detectors and a combined human detector provide the "observations" used for tracking. Trajectory initialization and termination are both fully automatic and rely on the confidences computed from the detection responses. An object is tracked by data association if its corresponding detection response can be found; otherwise it is tracked by a meanshift style tracker. Our method can track humans with both inter-object and scene occlusions. The system is evaluated on three sets of videos and compared with previous method.

This publication has 11 references indexed in Scilit: