Fast Action Localization in Large-Scale Video Archives

Abstract
Finding content in large video archives has so far required textual annotation to enable search by keywords. Our aim is to support retrieval from such archives using queries based on example video clips containing meaningful human actions. We propose a solution for scalable action search in large-scale archives by leveraging the complementarity between frame-level description and the temporal aggregation of descriptors. To permit fast search, we introduce a two-level cascade. The inexpensive first level employs aggregation to filter out a large part of the archive. At the second level, aided by feature selection, a more discriminative comparison by frame alignment ranks the remaining video sequences. We improve upon the state of the art on popular data sets, and we introduce a novel video archive data set, significantly larger than previous ones, on which we also report results.
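The two-level cascade described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's method: the mean-pooled frame descriptors, cosine scoring, and best-match-per-frame alignment are all stand-in assumptions for the paper's actual aggregation, feature selection, and alignment steps.

```python
import numpy as np

def aggregate(frames):
    # Level-1 signature: mean-pool frame descriptors over time
    # (an illustrative aggregation choice).
    return frames.mean(axis=0)

def level1_filter(query_frames, archive, keep=2):
    # Cheap first level: rank videos by cosine similarity of aggregated
    # descriptors and keep only the top candidates for the second level.
    q = aggregate(query_frames)
    q = q / np.linalg.norm(q)
    scored = []
    for vid, frames in archive.items():
        a = aggregate(frames)
        scored.append((vid, float(q @ (a / np.linalg.norm(a)))))
    scored.sort(key=lambda s: -s[1])
    return [vid for vid, _ in scored[:keep]]

def level2_align(query_frames, frames):
    # Discriminative second level: a frame-level alignment score, here a
    # simple best-match-per-query-frame stand-in for the paper's alignment.
    qn = query_frames / np.linalg.norm(query_frames, axis=1, keepdims=True)
    fn = frames / np.linalg.norm(frames, axis=1, keepdims=True)
    sim = qn @ fn.T                       # (n_query, n_archive) similarities
    return float(sim.max(axis=1).mean())  # average best match per query frame

rng = np.random.default_rng(0)
query = rng.standard_normal((5, 16))                  # 5 frames, 16-D descriptors
archive = {f"v{i}": rng.standard_normal((20, 16)) for i in range(4)}
archive["v2"] = np.vstack([query, query])             # video containing the query action

candidates = level1_filter(query, archive, keep=2)
ranking = sorted(candidates, key=lambda v: -level2_align(query, archive[v]))
print(ranking[0])  # → v2
```

The cascade pays the expensive alignment cost only on the few sequences surviving the cheap aggregation filter, which is the source of the scalability claimed in the abstract.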
Funding Information
  • French National Research Agency within the Joint French-Mexican Project Mex-Culture (ANR-11-IS02-001)