Automatic multimedia cross-modal correlation discovery

22 August 2004

conference paper
Published by Association for Computing Machinery (ACM)

p. 653-658
https://doi.org/10.1145/1014052.1014135

Abstract

Given an image (or video clip, or audio song), how do we automatically assign keywords to it? The general problem is to find correlations across the media in a collection of multimedia objects like video clips, with colors, and/or motion, and/or audio, and/or text scripts. We propose a novel, graph-based approach, "MMG", to discover such cross-modal correlations. Our "MMG" method requires no tuning, no clustering, and no user-determined constants; it can be applied to any multimedia collection, as long as we have a similarity function for each medium; and it scales linearly with the database size. We report auto-captioning experiments on the "standard" Corel image database of 680 MB, where it outperforms domain-specific, fine-tuned methods by up to 10 percentage points in captioning accuracy (50% relative improvement).
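
The abstract names a graph-based approach but does not spell out the algorithm, so the following is only a minimal sketch of how graph-based captioning could work, not the authors' implementation. It assumes that each image is linked to its region and caption-word nodes, that similar regions are linked to each other, and that candidate words are ranked by a random walk with restart from the uncaptioned image. The node names, the edges, and the restart probability of 0.15 are all hypothetical.

```python
import numpy as np

def random_walk_with_restart(adj, start, restart=0.15, iters=200):
    """Approximate steady-state visit probabilities of a walk that
    jumps back to node `start` with probability `restart` at each step."""
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0        # guard against isolated nodes
    transition = adj / col_sums          # column-normalized adjacency
    restart_vec = np.zeros(adj.shape[0])
    restart_vec[start] = 1.0
    scores = restart_vec.copy()
    for _ in range(iters):               # plain power iteration
        scores = (1 - restart) * transition @ scores + restart * restart_vec
    return scores

# Hypothetical toy graph: two captioned images, one uncaptioned query image.
names = ["img_a", "img_b", "img_query", "r_blue", "r_green", "sky", "grass"]
index = {name: i for i, name in enumerate(names)}
edges = [
    ("img_a", "r_blue"), ("img_a", "sky"),      # img_a: blue region, caption "sky"
    ("img_b", "r_green"), ("img_b", "grass"),   # img_b: green region, caption "grass"
    ("img_query", "r_blue"),                    # the query shares the blue region
    ("r_blue", "r_green"),                      # assumed similarity link between regions
]

adj = np.zeros((len(names), len(names)))
for u, v in edges:
    adj[index[u], index[v]] = adj[index[v], index[u]] = 1.0

scores = random_walk_with_restart(adj, index["img_query"])

# Rank the caption words by their affinity to the query image.
words = ["sky", "grass"]
ranked = sorted(words, key=lambda w: scores[index[w]], reverse=True)
print(ranked)
```

In this toy graph the word reached through the shorter path from the query image receives the higher score, which is the intuition behind using graph proximity as a cross-modal correlation measure. A real collection would need one similarity function per medium to create the links, as the abstract notes.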

Cited by 325 articles