Resolving Ambiguous Hand Pose Predictions by Exploiting Part Correlations

Abstract
The positions of the hand joints are important high-level features for hand-based human-computer interaction. We present a novel method to predict the 3-D joint positions from depth images and the parsed hand parts obtained with a pretrained classifier. The hand parts are utilized as an additional cue to resolve the multimodal predictions produced by previous regression-based methods, without significantly increasing the computational cost. In addition, we enforce hand motion constraints to fuse the per-pixel prediction results. The posterior distribution of the joints is formulated as a weighted product-of-experts model over the individual pixel predictions, which is maximized via the expectation-maximization algorithm on a learned low-dimensional space of the hand joint parameters. The experimental results show that the proposed method improves prediction accuracy considerably compared with rival methods that also regress joint locations from depth images. In particular, we show that a regressor learned on a synthesized dataset also gives accurate predictions on real-world depth images when the hand part correlations are enforced, despite the discrepancy between the two domains.
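The fusion step described above can be illustrated with a small sketch. This is not the authors' implementation; it assumes a hypothetical setup in which each pixel regresses a full joint-position vector, the pose lies in a PCA-style low-dimensional subspace (`mu`, `P`), and per-pixel weights (standing in for the hand-part consistency cue) are updated in an EM-like loop that alternates soft re-weighting with a weighted least-squares fit in the subspace:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: J joints in 3-D, flattened to D = 3*J values,
# with a K-dimensional learned pose subspace.
J, D, K = 6, 18, 4

# Assumed low-dimensional hand pose model (e.g. from PCA on training poses).
mu = rng.normal(size=D)                         # mean pose
P = np.linalg.qr(rng.normal(size=(D, K)))[0]    # orthonormal subspace basis

# Simulated per-pixel joint predictions, as a regression forest might produce.
N = 200
true_alpha = rng.normal(size=K)
truth = mu + P @ true_alpha
Y = truth + 0.1 * rng.normal(size=(N, D))       # N noisy per-pixel predictions

# EM-style fusion: E-step softly down-weights pixels whose predictions
# disagree with the current pose estimate (a stand-in for the part cue);
# M-step refits the subspace coefficients by a weighted projection.
alpha = np.zeros(K)
for _ in range(10):
    pose = mu + P @ alpha
    resid = np.linalg.norm(Y - pose, axis=1)
    w = np.exp(-resid**2 / (2 * np.median(resid)**2 + 1e-9))  # soft weights
    ybar = (w[:, None] * Y).sum(axis=0) / w.sum()             # weighted mean
    alpha = P.T @ (ybar - mu)                                 # project onto subspace

fused = mu + P @ alpha                          # fused joint positions
```

Constraining the solution to the learned subspace is what keeps the fused pose kinematically plausible even when individual pixel predictions are multimodal or noisy.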
Funding Information
  • National Research Foundation Singapore, through the International Research Centre in Singapore Funding Initiative, administered by the IDM Programme Office; the work was carried out at the BeingThere Centre, Institute of Media Innovation

This publication has 28 references indexed in Scilit.