Compressing Tags to Find Interesting Media Groups
Source:
CIKM (2009)
Abstract:
On photo sharing websites like Flickr and Zooomr, users
are offered the possibility to assign tags to their uploaded
pictures. Using these tags to find interesting groups of semantically
related pictures in the result set of a given query
is a problem with obvious applications. We analyse this
problem from a Minimum Description Length (MDL) perspective
and develop an algorithm that finds the most interesting
groups. The method is based on Krimp, which
finds small sets of patterns that characterise the data using
compression. These patterns are sets of tags, often assigned
together to photos.
The better a database compresses, the more structure it
contains and thus the more homogeneous it is. Following this
observation we devise a compression-based measure. Our
experiments on Flickr data show that the most interesting
and homogeneous groups are found. We show extensive examples
and compare to clusterings on the Flickr website.
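
The intuition that a homogeneous group of photos compresses well can be illustrated with a small sketch. This is not the measure defined in the paper and not the Krimp algorithm itself: it assumes a hand-picked list of tag-set patterns, uses a simple greedy cover, and ignores the cost of storing the patterns, which an MDL approach would include. All function names here are hypothetical.

```python
import math
from collections import Counter


def singleton_bits(photos):
    """Bits to encode every tag on its own, using Shannon-optimal code lengths."""
    counts = Counter(tag for tags in photos for tag in tags)
    total = sum(counts.values())
    return sum(-c * math.log2(c / total) for c in counts.values())


def cover(tags, patterns):
    """Greedily cover one photo's tags with patterns, largest pattern first.

    Tags not covered by any pattern are emitted as single-tag sets.
    """
    remaining = set(tags)
    used = []
    for pattern in sorted(patterns, key=len, reverse=True):
        if pattern <= remaining:
            used.append(pattern)
            remaining -= pattern
    used.extend(frozenset([tag]) for tag in sorted(remaining))
    return used


def pattern_bits(photos, patterns):
    """Bits to encode all photos via their covers (pattern storage cost ignored)."""
    usage = Counter(p for tags in photos for p in cover(tags, patterns))
    total = sum(usage.values())
    return sum(-u * math.log2(u / total) for u in usage.values())


def compression_ratio(photos, patterns):
    """Encoded size with patterns relative to the singleton baseline.

    Lower values suggest more shared structure. Assumes photos is non-empty.
    """
    return pattern_bits(photos, patterns) / singleton_bits(photos)


# Illustrative data only: one group where every photo shares the same tags,
# and one group of unrelated photos.
patterns = [frozenset({"paris", "eiffel", "tower"})]
homogeneous = [{"paris", "eiffel", "tower"}] * 4
mixed = [{"paris", "eiffel"}, {"cat", "pet"}, {"beach", "sun"}, {"car", "road"}]

print(compression_ratio(homogeneous, patterns))
print(compression_ratio(mixed, patterns))
```

In this toy setting the group whose photos all carry the same tag set compresses far better than the mixed group, where no pattern applies and the encoding falls back to single tags.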