Poselet Key-Framing: A Model for Human Activity Recognition

Abstract

In this paper, we develop a new model for recognizing human actions. An action is modeled as a very sparse sequence of temporally local discriminative key frames: collections of partial key-poses of the actor(s) that depict key states in the action sequence. We cast the learning of key frames in a max-margin discriminative framework, where we treat key frames as latent variables. This allows us to jointly learn a set of the most discriminative key frames while also learning the local temporal context between them. Key frames are encoded using a spatially localizable, poselet-like representation with HoG and BoW components learned from weak annotations; we rely on a structured SVM formulation to align our components and mine for hard negatives to boost localization performance. This results in a model that supports spatio-temporal localization and is insensitive to dropped frames or partial observations. We show classification performance that is competitive with the state of the art on the benchmark UT-Interaction dataset and illustrate that our model outperforms prior methods in an online streaming setting.
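To make the idea of latent key frames concrete, the following is a minimal illustrative sketch, not the paper's formulation. It assumes that each key frame already has a per-frame appearance score (the hypothetical `unary` table) and that the local temporal context can be approximated by a penalty on the deviation from an expected gap between consecutive key frames (the hypothetical `expected_gap` and `gap_weight`). Under those assumptions, the best placement of the ordered key frames can be found by dynamic programming.

```python
from typing import List, Tuple

NEG_INF = float("-inf")


def infer_key_frames(
    unary: List[List[float]],
    expected_gap: List[int],
    gap_weight: float = 1.0,
) -> Tuple[float, List[int]]:
    """Place K ordered key frames in a video of T frames.

    unary[k][t]     -- assumed appearance score of key frame k at frame t
    expected_gap[k] -- assumed preferred offset between key frames k and k+1
    Returns the best total score and the chosen frame indices.
    """
    num_keys, num_frames = len(unary), len(unary[0])
    if num_frames < num_keys:
        raise ValueError("video is shorter than the number of key frames")

    # best[k][t]: best score with key frame k placed at frame t.
    best = [[NEG_INF] * num_frames for _ in range(num_keys)]
    back = [[-1] * num_frames for _ in range(num_keys)]
    best[0] = list(unary[0])

    for k in range(1, num_keys):
        for t in range(k, num_frames):
            for s in range(k - 1, t):  # previous key frame strictly earlier
                penalty = gap_weight * abs((t - s) - expected_gap[k - 1])
                score = best[k - 1][s] + unary[k][t] - penalty
                if score > best[k][t]:
                    best[k][t], back[k][t] = score, s

    # Trace back from the best position of the last key frame.
    t = max(range(num_frames), key=lambda i: best[-1][i])
    total, frames = best[-1][t], [t]
    for k in range(num_keys - 1, 0, -1):
        t = back[k][t]
        frames.append(t)
    frames.reverse()
    return total, frames


# Toy example: two key frames in a six-frame video.
toy_unary = [[0, 5, 0, 0, 0, 0], [0, 0, 0, 0, 4, 0]]
toy_score, toy_frames = infer_key_frames(toy_unary, expected_gap=[3])
print(toy_score, toy_frames)  # 9.0 [1, 4]
```

Because only a few frames contribute to the score, such an inference step ignores most of the video, which is one plausible reading of why a sparse key-frame model tolerates dropped frames. In a max-margin setting, a step of this kind would typically be run inside the learning loop to fill in the latent key frame positions.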

