Parsing the Hand in Depth Images

Abstract
Hand pose tracking and gesture recognition are useful for human-computer interaction, but a major obstacle is the lack of discriminative features for a compact hand representation. We present a robust hand parsing scheme that extracts a high-level description of the hand from a depth image. A novel distance-adaptive selection method is proposed to obtain more discriminative depth-context features. In addition, we propose a Superpixel-Markov Random Field (SMRF) parsing scheme that enforces spatial smoothness and a label co-occurrence prior in order to remove misclassified regions. Compared to pixel-level filtering, the SMRF scheme is better suited to modeling misclassified regions, and its performance can be further improved by fusing temporal constraints. Overall, the proposed hand parsing scheme is accurate and efficient. Tests on a synthesized dataset show that it gives much higher accuracy for single-frame parsing and enhanced robustness for continuous sequence parsing compared to benchmarks. Tests on real-world depth images of the hand and the human body demonstrate the robustness of our method to complex hand configurations and its ability to generalize to different kinds of articulated objects.
