Diversifying Image Search with User Generated Content
Source:
Conference on Multimedia Information Retrieval, Vancouver, British Columbia (2008)
Keywords:
pseudo-relevance feedback, diversity, image retrieval, Flickr, retrieval performance, ambiguity
Abstract:
Large-scale image retrieval on the Web relies on the availability
of short snippets of text associated with the image.
This user-generated content is a primary source of information
about the content and context of an image. While
traditional information retrieval models focus on finding the
most relevant document without consideration for diversity,
image search requires results that are both diverse and relevant.
This is problematic for images because they are represented
very sparsely by text, and as with all user-generated
content the text for a given image can be extremely noisy.
The contribution of this paper is twofold. First, we present
a retrieval model which provides diverse results as a property
of the model itself, rather than in a post-retrieval step.
Relevance models offer a unified framework to afford the greatest
diversity without harming precision. Second, we show that
it is possible to minimize the trade-off between precision and
diversity, and that estimating the query model from the
distribution of tags favors the dominant sense of a query. Relevance
models operating only on tags offer the highest level of
diversity with no significant decrease in precision.
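
The idea of estimating a query model from the distribution of tags can be illustrated with a minimal sketch. The abstract does not give the paper's estimator, so the code below assumes a standard pseudo-relevance-feedback relevance model: images are ranked by query likelihood over their tags, and the tag distributions of the top-ranked images are mixed, weighted by that likelihood. The function names, the Jelinek-Mercer smoothing, the parameters `lam` and `top_k`, and the toy tag data are all assumptions made for illustration.

```python
import math
from collections import Counter


def smoothed_prob(term, tags, coll_counts, coll_size, lam=0.5):
    """Jelinek-Mercer smoothed P(term | tags of one image). Smoothing choice is an assumption."""
    p_doc = tags.count(term) / len(tags) if tags else 0.0
    p_coll = coll_counts[term] / coll_size
    return lam * p_doc + (1 - lam) * p_coll


def relevance_model(query, images, top_k=2, lam=0.5):
    """Estimate P(term | query) from the tags of the top_k images (hypothetical helper)."""
    coll_counts = Counter(t for tags in images.values() for t in tags)
    coll_size = sum(coll_counts.values())

    # Query likelihood P(Q | image) under the smoothed tag model.
    likelihood = {
        img: math.prod(
            smoothed_prob(q, tags, coll_counts, coll_size, lam) for q in query
        )
        for img, tags in images.items()
    }
    top = sorted(likelihood, key=likelihood.get, reverse=True)[:top_k]

    # Mix the tag distributions of the top images, weighted by likelihood.
    model = Counter()
    for img in top:
        for term in coll_counts:
            model[term] += likelihood[img] * smoothed_prob(
                term, images[img], coll_counts, coll_size, lam
            )
    total = sum(model.values())
    return {term: weight / total for term, weight in model.items()}


# Toy collection (invented): "jaguar" is ambiguous between the animal and the car.
images = {
    "img_a": ["jaguar", "cat", "jungle"],
    "img_b": ["jaguar", "car", "xk"],
    "img_c": ["beach", "sunset"],
}
model = relevance_model(["jaguar"], images)
for term, prob in sorted(model.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {prob:.3f}")
```

In this toy collection both senses of the query contribute equally to the estimated model, so tags from either sense can surface in the expanded query. In a real collection the sense carried by more of the top-ranked images receives more weight, which is one way to read the observation that tag distributions favor the dominant sense.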