WIRE: an open-source Web information retrieval environment
Source:
Workshop on Open Source Web Information Retrieval (OSWIR), Compiegne, France, pp. 27-30 (2005)
URL:
http://www.dcc.uchile.cl/%7Eccastill/papers/castillo_05_web_information_retrieval_environment.pdf
Keywords:
crawling
Abstract:
In this paper, we describe the WIRE (Web Information Retrieval Environment) project and focus on some details of its crawler component. The WIRE crawler is a scalable, highly configurable, high-performance, open-source Web crawler that we have used to study the characteristics of large Web collections.