Who's In the Picture
Source:
Advances in Neural Information Processing Systems 17, MIT Press, Cambridge, MA, pp. 137-144 (2005)
Abstract:
The context in which a name appears in a caption
provides powerful cues as to who is depicted in the associated
image. We obtain 44,773 face images, using a face detector, from
approximately half a million captioned news images and
automatically link names, obtained using a named entity recognizer,
with these faces. A simple clustering method can produce fair
results. We improve these results significantly by combining the
clustering process with a model of the probability that an
individual is depicted given the context of the name in the caption. Once the labeling
procedure is over, we have an accurately labeled set of faces, an
appearance model for each individual depicted, and a natural
language model that can produce accurate results on captions in
isolation.
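
The idea of combining clustering with a context model can be illustrated with a minimal sketch. This is not the authors' procedure, only a k-means-style approximation under stated assumptions: each face is a small feature vector, each caption supplies a list of candidate names, and a hypothetical table context_prior gives the probability that each name is depicted given its caption context. The function names, the scoring rule (log prior minus squared distance), and the toy data are all illustrative assumptions.

```python
import math

def assign_faces(face_vecs, candidate_names, centroids, context_prior):
    """Label each face with the caption name that scores highest.

    Assumed score: log P(name depicted | context) minus the squared
    distance from the face vector to that name's appearance centroid.
    """
    labels = []
    for vec in face_vecs:
        best_name, best_score = None, -math.inf
        for name in candidate_names:
            dist2 = sum((a - b) ** 2 for a, b in zip(vec, centroids[name]))
            score = math.log(context_prior[name]) - dist2
            if score > best_score:
                best_name, best_score = name, score
        labels.append(best_name)
    return labels

def update_centroids(face_vecs, labels, centroids):
    """Recompute each labeled name's centroid as the mean of its faces."""
    new_centroids = dict(centroids)
    for name in set(labels):
        members = [v for v, l in zip(face_vecs, labels) if l == name]
        dim = len(members[0])
        new_centroids[name] = [
            sum(v[i] for v in members) / len(members) for i in range(dim)
        ]
    return new_centroids

# Toy data (hypothetical): two faces and two names from one caption.
faces = [[0.0, 0.1], [1.0, 0.9]]
names = ["A", "B"]
centroids = {"A": [0.0, 0.0], "B": [1.0, 1.0]}
context_prior = {"A": 0.6, "B": 0.4}

labels = assign_faces(faces, names, centroids, context_prior)
centroids = update_centroids(faces, labels, centroids)
```

Alternating the two steps over many captions refines the appearance centroids, while the prior breaks ties when appearance alone is ambiguous. The sketch does not enforce that a name is used at most once per image, which a fuller treatment would need.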