Hallucinated Humans as the Hidden Context for Labeling 3D Scenes

1 June 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 2993-3000
https://doi.org/10.1109/cvpr.2013.385

Abstract

For scene understanding, one popular approach has been to model object-object relationships. In this paper, we hypothesize that such relationships are only an artifact of certain hidden factors, such as humans. For example, a monitor and a keyboard are strongly spatially correlated only because a human types on the keyboard while watching the monitor. Our goal is to learn this hidden human context (i.e., the human-object relationships) and to use it as a cue for labeling scenes. We present the Infinite Factored Topic Model (IFTM), in which a scene is considered to be generated from two types of topics: human configurations and human-object relationships. This enables our algorithm to hallucinate the possible configurations of humans in the scene parsimoniously. Given only a dataset of scenes containing objects but not humans, we show that our algorithm can recover the human-object relationships. We then test our algorithm on the task of attribute and object labeling in 3D scenes and show consistent improvements over the state of the art.
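The hypothesis in the abstract, that object-object correlation can arise purely from a shared hidden human, can be illustrated with a toy simulation. The sketch below is not the IFTM and is not taken from the paper: the one-dimensional room, the offsets (0.4 and 0.8), the noise level, and the names `sample_scenes` and `pearson` are all assumptions made for illustration. It places a keyboard and a monitor relative to an unobserved human position and compares the resulting correlation with that of independently placed objects.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def sample_scenes(n_scenes, rng, with_human=True):
    """Sample 1D keyboard and monitor positions for n_scenes toy scenes.

    with_human=True: both objects are placed relative to a hidden human
    position (hypothetical offsets, small Gaussian noise).
    with_human=False: both objects are placed independently at random.
    """
    keyboard_x, monitor_x = [], []
    for _ in range(n_scenes):
        if with_human:
            human_x = rng.uniform(0.0, 5.0)  # hidden; never returned
            keyboard_x.append(human_x + 0.4 + rng.gauss(0.0, 0.05))
            monitor_x.append(human_x + 0.8 + rng.gauss(0.0, 0.05))
        else:
            keyboard_x.append(rng.uniform(0.0, 5.0))
            monitor_x.append(rng.uniform(0.0, 5.0))
    return keyboard_x, monitor_x


rng = random.Random(0)
corr_hidden_human = pearson(*sample_scenes(2000, rng, with_human=True))
corr_independent = pearson(*sample_scenes(2000, rng, with_human=False))
print(f"correlation with hidden human: {corr_hidden_human:.3f}")
print(f"correlation without human:     {corr_independent:.3f}")
```

In this toy setting the observed scenes contain only objects, yet the keyboard and monitor positions are strongly correlated whenever a hidden human generated them, which is the kind of structure the abstract says the model aims to recover.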


Cited by 86 articles