A Bipartite Graph Model for Associating Images and Text
Source:
Hyderabad, India (2007)
URL:
http://cobweb.ecn.purdue.edu/~malcolm/yahoo/Srini2007(BipartiteGraphImagesTextIJCAI).pdf
Abstract:
The joint modeling of image and textual content is even more important now because of the availability of large databases of image-rich web pages and the tagging phenomenon. Much of the current work has focused on one-way association (image to text or tags). The association is often captured by building a model with hidden variables. In this paper, we propose a simple model based on random walks on bipartite graphs for joint modeling of image and textual content. We show its effectiveness for several tasks: automatic image annotation, tag association, tag localization, and spurious tag detection. Such random walk models are also useful for other tasks such as web search.

Joint keyword–image modeling has a relatively short but rich history. Two important issues in joint modeling are image representation and statistical modeling. Images are represented either as collections of blobs [Barnard et al., 2003] or as collections of salient points [Bosch et al., 2006]. Each blob is described by features: color and texture vectors. There are several techniques for detecting interest points [Schmid et al., 2000]. Interest points are usually represented by Scale Invariant Feature Transform (SIFT) descriptors [Lowe, 2004]. The feature vectors are often vector quantized for representational simplicity. The vector-quantized features act as a form of “visual word”, and the joint modeling problem can then be treated as a machine translation problem [Duygulu et al., 2002].
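To make the random-walk idea from the abstract concrete, here is a minimal sketch of a random walk with restart on an image-tag bipartite graph. The text above does not give the paper's exact formulation, so the restart scheme, the `restart` parameter, the toy `IMAGE_TAGS` data, and all function names are illustrative assumptions rather than the authors' method.

```python
# Hypothetical toy data (not from the paper): each image is linked to its tags.
IMAGE_TAGS = {
    "img1": ["beach", "sea"],
    "img2": ["sea", "boat"],
    "img3": ["forest", "tree"],
}


def build_bipartite(image_tags):
    """Return an undirected adjacency map whose nodes are images and tags."""
    adj = {}
    for image, tags in image_tags.items():
        for tag in tags:
            adj.setdefault(("image", image), set()).add(("tag", tag))
            adj.setdefault(("tag", tag), set()).add(("image", image))
    return adj


def random_walk_with_restart(adj, start, restart=0.3, iters=100):
    """Iterate a walk that jumps back to `start` with probability `restart`.

    Assumed scheme: at each step the walker either restarts or moves to a
    uniformly chosen neighbor on the other side of the bipartite graph.
    """
    prob = {node: 0.0 for node in adj}
    prob[start] = 1.0
    for _ in range(iters):
        nxt = {node: 0.0 for node in adj}
        for node, p in prob.items():
            share = (1.0 - restart) * p / len(adj[node])
            for neighbor in adj[node]:
                nxt[neighbor] += share
        nxt[start] += restart
        prob = nxt
    return prob


def rank_tags(adj, image, restart=0.3, iters=100):
    """Rank all tags by their visiting probability for a walk started at `image`."""
    scores = random_walk_with_restart(adj, ("image", image), restart, iters)
    tags = [(node[1], s) for node, s in scores.items() if node[0] == "tag"]
    return sorted(tags, key=lambda item: -item[1])


if __name__ == "__main__":
    graph = build_bipartite(IMAGE_TAGS)
    for tag, score in rank_tags(graph, "img1"):
        print(f"{tag}: {score:.4f}")
```

In this sketch, tags reachable from the query image through shared tags (here `boat`, via `sea` and `img2`) receive a nonzero score, which is the intuition behind using such walks for annotation and tag association. Tags in a disconnected part of the graph score zero.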