Learning Image Representations Tied to Ego-Motion
- 1 December 2015
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE) in 2015 IEEE International Conference on Computer Vision (ICCV)
- pp. 1413–1421
- https://doi.org/10.1109/iccv.2015.166
Abstract
Understanding how images of objects and scenes behave in response to specific ego-motions is a crucial aspect of proper visual development, yet existing visual learning methods are conspicuously disconnected from the physical source of their images. We propose to exploit proprioceptive motor signals to provide unsupervised regularization in convolutional neural networks to learn visual representations from egocentric video. Specifically, we enforce that our learned features exhibit equivariance, i.e., they respond predictably to transformations associated with distinct ego-motions. With three datasets, we show that our unsupervised feature learning approach significantly outperforms previous approaches on visual recognition and next-best-view prediction tasks. In the most challenging test, we show that features learned from video captured on an autonomous driving platform improve large-scale scene recognition in static images from a disjoint domain.
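The equivariance objective described above can be illustrated with a minimal numerical sketch. Under the assumption that each discretized ego-motion class g is paired with a learned linear map M_g acting on the feature space, "responding predictably" amounts to penalizing the distance between the mapped features of a frame and the features of the frame observed after the motion. All names, dimensions, and the random linear feature extractor below are illustrative stand-ins, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions: input size, feature size, number of
# discretized ego-motion classes.
D_IN, D_FEAT, N_MOTIONS = 32, 8, 4

# Stand-in for a learned feature extractor z(x); the paper uses a CNN,
# here a fixed random linear map keeps the sketch self-contained.
W = rng.normal(size=(D_FEAT, D_IN))

def features(x):
    """Feature embedding z(x) of an image vector x."""
    return W @ x

# One transformation matrix M_g per ego-motion class, to be learned
# jointly with the feature extractor.
M = rng.normal(size=(N_MOTIONS, D_FEAT, D_FEAT))

def equivariance_loss(x_t, x_t1, g):
    """Penalize || M_g z(x_t) - z(x_{t+1}) ||^2 for a frame pair
    related by ego-motion class g."""
    residual = M[g] @ features(x_t) - features(x_t1)
    return float(residual @ residual)

# Frame pair from egocentric video, related by motion class g=2.
x_t = rng.normal(size=D_IN)
x_t1 = rng.normal(size=D_IN)
loss = equivariance_loss(x_t, x_t1, g=2)
```

In training, this term would act as an unsupervised regularizer alongside a supervised recognition loss, with gradients flowing into both the feature extractor and the per-motion maps.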