Large Scale Fine-Grained Categorization and Domain-Specific Transfer Learning
- 1 June 2018
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 4109-4118
- https://doi.org/10.1109/cvpr.2018.00432
Abstract
Transferring the knowledge learned from large scale datasets (e.g., ImageNet) via fine-tuning offers an effective solution for domain-specific fine-grained visual categorization (FGVC) tasks (e.g., recognizing bird species or car make & model). In such scenarios, data annotation often calls for specialized domain knowledge and thus is difficult to scale. In this work, we first tackle a problem in large scale FGVC. Our method won first place in the iNaturalist 2017 large scale species classification challenge. Central to the success of our approach is a training scheme that uses higher image resolution and deals with the long-tailed distribution of training data. Next, we study transfer learning via fine-tuning from large scale datasets to small scale, domain-specific FGVC datasets. We propose a measure to estimate domain similarity via Earth Mover's Distance and demonstrate that transfer learning benefits from pre-training on a source domain that is similar to the target domain by this measure. Our proposed transfer learning outperforms ImageNet pre-training and obtains state-of-the-art results on multiple commonly used FGVC datasets.
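The abstract does not spell out how the Earth Mover's Distance measure is computed, so the following is only a minimal sketch of one way such a domain similarity could be estimated, not the authors' implementation. It assumes each domain is summarized by one mean feature vector per class, weighted by that class's image count, and that distance is mapped to a similarity with `exp(-gamma * d)`. The function names, the Euclidean ground cost, and the default `gamma` are all illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linprog


def earth_movers_distance(src_feats, src_weights, tgt_feats, tgt_weights):
    """EMD between two weighted point sets, solved as a small linear program.

    Assumption: each row of *_feats is a per-class mean feature vector and
    each weight is that class's image count (hypothetical representation).
    """
    src_feats = np.asarray(src_feats, dtype=float)
    tgt_feats = np.asarray(tgt_feats, dtype=float)
    # Normalize weights so both domains carry unit total mass.
    a = np.asarray(src_weights, dtype=float)
    b = np.asarray(tgt_weights, dtype=float)
    a = a / a.sum()
    b = b / b.sum()
    m, n = len(a), len(b)

    # Ground cost: Euclidean distance between every source/target class pair.
    cost = np.linalg.norm(src_feats[:, None, :] - tgt_feats[None, :, :], axis=2)

    # Flow f[i, j] is flattened row-major; rows must sum to a, columns to b.
    row_sums = np.kron(np.eye(m), np.ones((1, n)))
    col_sums = np.kron(np.ones((1, m)), np.eye(n))
    A_eq = np.vstack([row_sums, col_sums])
    b_eq = np.concatenate([a, b])

    result = linprog(cost.ravel(), A_eq=A_eq, b_eq=b_eq,
                     bounds=(0, None), method="highs")
    if not result.success:
        raise RuntimeError(result.message)
    # Total mass is 1, so the optimal transport cost is the EMD itself.
    return float(result.fun)


def domain_similarity(src_feats, src_weights, tgt_feats, tgt_weights, gamma=0.01):
    """Map distance to a similarity in (0, 1]; gamma is an assumed scale."""
    d = earth_movers_distance(src_feats, src_weights, tgt_feats, tgt_weights)
    return float(np.exp(-gamma * d))
```

Under these assumptions, a candidate source domain with a higher `domain_similarity` to the target would be the preferred pre-training set. Solving the exact linear program is practical only for a modest number of classes; larger label sets would likely need an approximate solver.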