Leveraging the Wisdom of the Crowd for Fine-Grained Recognition

Abstract
Fine-grained recognition concerns categorization at sub-ordinate levels, where the distinction between object classes is highly local. Compared to basic level recognition, fine-grained categorization can be more challenging as there are in general less data and fewer discriminative features. This necessitates the use of a stronger prior for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called “Bubbles” that reveals discriminative features humans use. The player's goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (“bubbles”), with a certain penalty. With proper setup the game generates discriminative bubbles with assured quality. We next propose the “BubbleBank” representation that uses the human selected bubbles to improve machine recognition performance. Finally, we demonstrate how to extend BubbleBank to a view-invariant 3D representation. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.
Funding Information
  • Intel
  • ISTC
  • ONR-MURI (NSF-IIS-1115313)
  • Max Planck Center for Visual Computing and Communication

This publication has 37 references indexed in Scilit: