Efficient object localization using Convolutional Networks

1 June 2015

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 10636919, p. 648-656
https://doi.org/10.1109/cvpr.2015.7298664

Abstract

Recent state-of-the-art performance on human-body pose estimation has been achieved with Deep Convolutional Networks (ConvNets). Traditional ConvNet architectures include pooling and sub-sampling layers which reduce computational requirements, introduce invariance and prevent over-training. These benefits of pooling come at the cost of reduced localization accuracy. We introduce a novel architecture which includes an efficient 'position refinement' model that is trained to estimate the joint offset location within a small region of the image. This refinement model is jointly trained in cascade with a state-of-the-art ConvNet model [21] to achieve improved accuracy in human joint location estimation. We show that the variance of our detector approaches the variance of human annotations on the FLIC [20] dataset and outperforms all existing approaches on the MPII-human-pose dataset [1].
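The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch. This is not the paper's model: the learned refinement ConvNet is replaced here by a plain argmax over a small full-resolution window, and the names refine_joint, argmax2d, stride and half_window are hypothetical, chosen only for illustration. It shows how a pooled (low-resolution) detection is mapped back to image coordinates and then corrected by an offset estimated inside a small region.

```python
def argmax2d(grid):
    """Return (row, col) of the largest value in a 2D list of numbers."""
    best = (0, 0)
    for r, row in enumerate(grid):
        for c, value in enumerate(row):
            if value > grid[best[0]][best[1]]:
                best = (r, c)
    return best


def refine_joint(coarse_heatmap, fine_response, stride, half_window):
    """Coarse-to-fine joint localization (illustrative stand-in only).

    coarse_heatmap: low-resolution scores, as produced after pooling.
    fine_response:  full-resolution scores; an assumed stand-in for the
                    output of a learned refinement model.
    stride:         assumed pooling factor between the two resolutions.
    half_window:    half-size of the region searched around the coarse guess.
    Returns ((row, col), (d_row, d_col)): refined position and offset.
    """
    # Step 1: coarse detection, mapped to the centre of its full-res cell.
    cell_r, cell_c = argmax2d(coarse_heatmap)
    base_r = cell_r * stride + stride // 2
    base_c = cell_c * stride + stride // 2

    # Step 2: crop a small window around the coarse guess (clamped to bounds).
    height, width = len(fine_response), len(fine_response[0])
    r0, r1 = max(0, base_r - half_window), min(height, base_r + half_window + 1)
    c0, c1 = max(0, base_c - half_window), min(width, base_c + half_window + 1)
    window = [row[c0:c1] for row in fine_response[r0:r1]]

    # Step 3: estimate the offset inside the window and apply it.
    win_r, win_c = argmax2d(window)
    offset = (r0 + win_r - base_r, c0 + win_c - base_c)
    return (base_r + offset[0], base_c + offset[1]), offset


# Toy data: the true joint sits at (9, 6) in a 16x16 image; after 4x pooling
# the coarse map can only say "cell (2, 1)".
fine = [[0.0] * 16 for _ in range(16)]
fine[9][6] = 1.0
coarse = [[0.0] * 4 for _ in range(4)]
coarse[2][1] = 1.0

position, offset = refine_joint(coarse, fine, stride=4, half_window=2)
print(position, offset)
```

In this toy case the coarse guess alone lands on the cell centre (10, 6); the window search recovers the one-pixel error that pooling introduced. In the paper the offset is predicted by a trained network rather than by an argmax.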

This publication has 15 references indexed in Scilit:

Strong Appearance and Expressive Spatial Models for Human Pose Estimation
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Human Pose Estimation Using Body Parts Dependent Joint Regressors
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Articulated Pose Estimation Using Discriminative Armlet Classifiers
Published by Institute of Electrical and Electronics Engineers (IEEE), 2013
Learning effective human pose estimation from inaccurate annotation
Published by Institute of Electrical and Electronics Engineers (IEEE), 2011
Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation
Published by British Machine Vision Association and Society for Pattern Recognition, 2010
Poselets: Body part detectors trained using 3D human pose annotations
Published by Institute of Electrical and Electronics Engineers (IEEE), 2009
Pictorial structures revisited: People detection and articulated pose estimation
Published by Institute of Electrical and Electronics Engineers (IEEE), 2009
A discriminatively trained, multiscale, deformable part model
Published by Institute of Electrical and Electronics Engineers (IEEE), 2008
Gradient-based learning applied to document recognition
Proceedings of the IEEE, 1998
Signature Verification Using a “Siamese” Time Delay Neural Network
International Journal of Pattern Recognition and Artificial Intelligence, 1993

Cited by 732 articles