Fast Action Localization in Large-Scale Video Archives
- 2 September 2015
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Circuits and Systems for Video Technology
- Vol. 26 (10), 1917-1930
- https://doi.org/10.1109/tcsvt.2015.2475835
Abstract
Finding content in large video archives has so far required textual annotation to enable search by keywords. Our aim is to support retrieval from such archives using queries based on example video clips that contain meaningful human actions. We propose a solution for the scalable search of actions in large-scale archives by leveraging the complementarity between the description at the frame level and the aggregation in time of descriptors. To permit fast search, we introduce a two-level cascade. The inexpensive first level employs aggregation to filter out a large part of the video. At the second level, aided by feature selection, a more discriminative comparison by frame alignment ranks the remaining video sequences. We improve upon the state of the art on popular data sets, and we introduce and show the results on a novel video archive data set that is significantly larger than previous ones.
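The two-level cascade described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: mean pooling stands in for the temporal aggregation, classic dynamic time warping stands in for the frame alignment, the feature selection step is omitted, and every name here (`mean_pool`, `dtw_distance`, `cascade_search`, `keep_ratio`) and the toy descriptors are hypothetical.

```python
import math


def mean_pool(frames):
    """Aggregate per-frame descriptors over time by averaging each dimension."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def euclidean(a, b):
    """Euclidean distance between two descriptors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(query, candidate):
    """Frame alignment cost computed with classic dynamic time warping."""
    n, m = len(query), len(candidate)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(query[i - 1], candidate[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def cascade_search(query, archive, keep_ratio=0.5):
    """Level 1 filters by pooled descriptor; level 2 ranks survivors by alignment."""
    q_pooled = mean_pool(query)
    # Level 1 (cheap): one distance per clip on the aggregated descriptor.
    coarse = sorted(archive, key=lambda name: euclidean(q_pooled, mean_pool(archive[name])))
    survivors = coarse[:max(1, int(len(coarse) * keep_ratio))]
    # Level 2 (costly): frame-by-frame alignment, only on the clips that survived.
    return sorted(survivors, key=lambda name: dtw_distance(query, archive[name]))


# Toy 2-D "descriptors", one per frame (assumed data, for illustration only).
query = [[0, 0], [1, 0], [2, 0]]
archive = {
    "same_action": [[0, 0], [1, 0], [1, 0], [2, 0]],
    "reversed": [[2, 0], [1, 0], [0, 0]],
    "unrelated": [[9, 9], [9, 8], [8, 9]],
    "far": [[5, 5], [6, 5], [7, 5]],
}
ranking = cascade_search(query, archive)
print(ranking)  # ['same_action', 'reversed']
```

In this toy archive the pooled descriptors of `same_action` and `reversed` are identical to that of the query, so the first level cannot tell them apart. The alignment at the second level can, which is the kind of complementarity between aggregation and frame-level description that the abstract refers to.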
Funding Information
- French National Research Agency within the Joint French-Mexican Project Mex-Culture (ANR-11-IS02-001)