Fast Action Localization in Large-Scale Video Archives

Abstract
Finding content in large video archives has so far required textual annotation to enable search by keywords. Our aim is to support retrieval from such archives using queries based on example video clips containing meaningful human actions. We propose a solution for scalable action search in large-scale archives by leveraging the complementarity between frame-level description and the temporal aggregation of descriptors. To permit fast search, we introduce a two-level cascade. The inexpensive first level employs aggregation to filter out a large part of the archive. At the second level, aided by feature selection, a more discriminative comparison by frame alignment ranks the remaining video sequences. We improve upon the state of the art on popular data sets, and we introduce a novel video archive data set, significantly larger than previous ones, on which we also report results.
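The two-level cascade described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the mean-pooled frame descriptors, cosine scoring, and best-match-per-frame alignment are all stand-in assumptions for the paper's actual aggregation, feature selection, and alignment steps.

```python
import numpy as np

def aggregate(frames):
    # Level-1 signature: mean-pool frame descriptors over time
    # (an illustrative aggregation choice).
    return frames.mean(axis=0)

def level1_filter(query_frames, archive, keep=2):
    # Cheap first level: rank videos by cosine similarity of aggregated
    # descriptors and keep only the top candidates for the second level.
    q = aggregate(query_frames)
    q = q / np.linalg.norm(q)
    scored = []
    for vid, frames in archive.items():
        a = aggregate(frames)
        scored.append((vid, float(q @ (a / np.linalg.norm(a)))))
    scored.sort(key=lambda s: -s[1])
    return [vid for vid, _ in scored[:keep]]

def level2_align(query_frames, frames):
    # Discriminative second level: a frame-level alignment score, here a
    # simple best-match-per-query-frame stand-in for the paper's alignment.
    qn = query_frames / np.linalg.norm(query_frames, axis=1, keepdims=True)
    fn = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    sim = qn @ fn.T                       # (n_query, n_archive) similarities
    return float(sim.max(axis=1).mean())  # average best match per query frame

rng = np.random.default_rng(0)
query = rng.standard_normal((5, 16))                  # 5 frames, 16-D descriptors
archive = {f"v{i}": rng.standard_normal((20, 16)) for i in range(4)}
archive["v2"] = np.vstack([query, query])             # video containing the query action

candidates = level1_filter(query, archive, keep=2)
ranking = sorted(candidates, key=lambda v: -level2_align(query, archive[v]))
print(ranking[0])  # → v2
```

The cascade pays the expensive alignment cost only on the few sequences surviving the cheap aggregation filter, which is the source of the scalability claimed in the abstract.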
Funding Information
  • French National Research Agency within the Joint French-Mexican Project Mex-Culture (ANR-11-IS02-001)