Pose search: Retrieving people using their pose

Abstract
We describe a method for retrieving shots containing a particular 2D human pose from unconstrained movie and TV videos. The method involves first localizing the spatial layout of the head, torso and limbs in individual frames using pictorial structures, and associating these through a shot by tracking. A feature vector describing the pose is then constructed from the pictorial structure. Shots can be retrieved either by querying on a single frame with the desired pose, or through a pose classifier trained from a set of pose examples. Our main contribution is an effective system for retrieving people based on their pose, and in particular we propose and investigate several pose descriptors which are person, clothing, background and lighting independent. As a second contribution, we improve the performance over existing methods for localizing upper body layout on unconstrained video. We compare the spatial layout pose retrieval to a baseline method where poses are retrieved using a HOG descriptor. Performance is assessed on five episodes of the TV series 'Buffy the Vampire Slayer', and pose retrieval is demonstrated also on three Hollywood movies..

This publication has 22 references indexed in Scilit: