Estimating number of citations using author reputation
Source:
String Processing and Information Retrieval (SPIRE) (2007)
Abstract:
We study the problem of predicting the popularity of items in
a dynamic environment in which authors continuously post new items and
provide feedback on existing items.
This formulation applies to predicting the popularity of blog posts,
ranking photographs in a photo-sharing system, or predicting the citations
of a scientific article, using author information and monitoring the
items of interest for a short period of time after their creation.
As a case study, we show how to estimate the number of citations for
an academic paper using information about past articles written by the
paper's author(s).
If we use only the citation information collected over a short period
of time, we obtain a predicted value that has a correlation of r=0.57
with the actual value. This is our baseline prediction. Our
best-performing system improves on that prediction by adding features
extracted from the past publishing history of the paper's authors,
increasing the correlation between the actual and the predicted values
to r=0.81.
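
The r values above measure agreement between predicted and actual citation counts. Assuming r denotes the Pearson correlation coefficient (the abstract does not name the variant), a minimal sketch of that evaluation step might look like the following. The function name pearson_r and the toy numbers are hypothetical and are not taken from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data for illustration only (not results from the paper):
# actual citation counts and the counts some predictor produced for them.
actual = [12, 3, 45, 7, 20]
predicted = [10, 5, 30, 9, 18]
r = pearson_r(actual, predicted)
print(f"r = {r:.2f}")
```

Under this reading, a baseline and an extended predictor would each be scored by calling pearson_r on their outputs against the same actual counts, and the two coefficients compared.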