Publication

Distribution of Relevant Documents in Domain-level Aggregates for Topic Distillation

Source:

Alternate Track Papers and Posters of the 13th International World Wide Web Conference, pp. 372-373 (2004)

ISBN:

1-58113-912-8

Abstract:

In this paper, we study the distribution of relevant documents in aggregates, formed by grouping the retrieved documents according to their domain. For each aggregate, we take into account its size, and a measure of the correlation between its incoming and outgoing hyperlinks. We report on a preliminary experiment with two TREC topic distillation tasks, where we find that larger aggregates, or those aggregates with correlated hyperlinks, are more likely to contain relevant documents. This result shows that the distribution of domain-level aggregates is potentially useful for finding relevant documents.
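
The grouping described in the abstract can be sketched roughly as follows. This is only an illustration, not the authors' implementation: the abstract does not say which correlation measure is used, so the sketch assumes a Pearson correlation between per-document incoming and outgoing link counts. The input format and the names `domain_aggregates` and `pearson` are hypothetical.

```python
from collections import defaultdict
from math import sqrt
from urllib.parse import urlparse


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when it is undefined
    (fewer than two points, or zero variance in either list)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def domain_aggregates(retrieved):
    """Group retrieved documents by domain and describe each aggregate.

    retrieved: iterable of (url, in_links, out_links) tuples.
    This input format is an assumption made for the sketch.
    """
    groups = defaultdict(list)
    for url, in_links, out_links in retrieved:
        # The host part of the URL stands in for the document's domain.
        groups[urlparse(url).netloc.lower()].append((in_links, out_links))

    stats = {}
    for domain, docs in groups.items():
        ins = [d[0] for d in docs]
        outs = [d[1] for d in docs]
        stats[domain] = {
            "size": len(docs),  # number of retrieved documents in the aggregate
            "link_correlation": pearson(ins, outs),  # assumed measure
        }
    return stats
```

Under these assumptions, aggregates could then be ranked by `size` or `link_correlation` to check whether relevant documents concentrate in the larger or more correlated ones.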
