Cross-Domain Self-Supervised Multi-task Feature Learning Using Synthetic Imagery

1 June 2018

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 762-771
https://doi.org/10.1109/cvpr.2018.00086

Abstract

In human learning, it is common to use multiple sources of information jointly. However, most existing feature learning approaches learn from only a single task. In this paper, we propose a novel multi-task deep network to learn generalizable high-level visual representations. Since multi-task learning requires annotations for multiple properties of the same training instance, we look to synthetic images to train our network. To overcome the domain difference between real and synthetic data, we employ an unsupervised feature space domain adaptation method based on adversarial learning. Given an input synthetic RGB image, our network simultaneously predicts its surface normal, depth, and instance contour, while also minimizing the feature space domain differences between real and synthetic data. Through extensive experiments, we demonstrate that our network learns more transferable representations than single-task baselines. Our learned representation produces state-of-the-art transfer learning results on PASCAL VOC 2007 classification and 2012 detection.
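The training objective described in the abstract (three prediction tasks plus an adversarial term that aligns real and synthetic features) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss functions or their weighting, so the binary cross-entropy adversarial loss, the unit weights, the function names, and the numeric values below are all assumptions chosen for illustration.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12  # guard against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def mean(values):
    return sum(values) / len(values)

def discriminator_loss(d_real, d_synth):
    """Assumed discriminator objective: score real features as 1, synthetic as 0."""
    return mean([bce(p, 1) for p in d_real]) + mean([bce(p, 0) for p in d_synth])

def feature_adversarial_loss(d_synth):
    """Assumed feature-extractor objective: make synthetic features look real."""
    return mean([bce(p, 1) for p in d_synth])

def total_network_loss(task_losses, d_synth, task_weights=None, adv_weight=1.0):
    """Weighted sum of the per-task losses plus the adversarial term.

    The weights are hypothetical; the abstract does not specify them.
    """
    if task_weights is None:
        task_weights = {name: 1.0 for name in task_losses}
    supervised = sum(task_weights[name] * value for name, value in task_losses.items())
    return supervised + adv_weight * feature_adversarial_loss(d_synth)

# Made-up per-task losses for one synthetic batch (illustrative values only).
task_losses = {"surface_normal": 0.40, "depth": 0.25, "instance_contour": 0.10}

# Made-up discriminator outputs: probability that a feature map came from a real image.
d_real = [0.9, 0.8]
d_synth = [0.5, 0.5]

network_loss = total_network_loss(task_losses, d_synth)
disc_loss = discriminator_loss(d_real, d_synth)
print(f"network loss: {network_loss:.4f}, discriminator loss: {disc_loss:.4f}")
```

In a sketch like this, the discriminator and the feature extractor would be updated in alternation: the discriminator lowers `disc_loss`, while the network lowers `network_loss`, which pushes synthetic features toward the real feature distribution while still solving the three prediction tasks.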

Cited by 108 articles