MySpiders: Evolve Your Own Intelligent Web Crawlers

Authors: Gautam Pant, Filippo Menczer

Abstract

The dynamic nature of the World Wide Web makes it a challenge to find information that is both relevant and recent. Intelligent agents can complement the power of search engines to meet this challenge. We present a Web tool called MySpiders, which implements an evolutionary algorithm managing a population of adaptive crawlers that browse the Web autonomously. Each agent acts as an intelligent client on behalf of the user, driven by a user query and by textual and linkage clues in the crawled pages. Agents autonomously decide which links to follow, which clues to internalize, when to spawn offspring to focus the search near a relevant source, and when to starve. The tool is available to the public as a threaded Java applet. We discuss the development and deployment of such a system.
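
The agent life cycle described in the abstract (follow a link, gain or lose energy, spawn offspring near relevant pages, starve otherwise) can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the MySpiders implementation: the in-memory TOY_WEB graph, the keyword-overlap relevance score, the uniformly random link choice, the rule that only newly discovered pages yield energy, and the cost and threshold values are all hypothetical choices made for illustration.

```python
import random

# Hypothetical toy "web": page id -> (page text, outgoing links).
TOY_WEB = {
    "a": ("evolutionary web crawlers", ["b", "c"]),
    "b": ("adaptive crawlers follow links", ["a", "d"]),
    "c": ("cooking recipes", ["a"]),
    "d": ("intelligent web agents and crawlers", ["b"]),
}


def relevance(text, query):
    """Fraction of query terms that appear in the page text (assumed scoring rule)."""
    words = set(text.lower().split())
    terms = query.lower().split()
    return sum(t in words for t in terms) / len(terms)


def crawl(web, start, query, steps=50, cost=0.3, threshold=1.0, seed=0):
    """Run a toy population of energy-driven crawlers and return the pages visited."""
    rng = random.Random(seed)
    population = [(start, 0.5)]  # each agent is (current page, energy)
    visited = {start}
    for _ in range(steps):
        if not population:
            break  # every agent has starved
        survivors = []
        for page, energy in population:
            links = web[page][1]
            if not links:
                continue  # dead end: the agent is dropped
            target = rng.choice(links)  # assumption: uniform random link choice
            # Assumption: only newly discovered pages yield energy.
            gain = relevance(web[target][0], query) if target not in visited else 0.0
            visited.add(target)
            energy += gain - cost  # every move costs energy
            if energy <= 0:
                continue  # the agent starves
            if energy >= threshold:
                # Spawn an offspring at the same page; parent and child split the energy.
                survivors.append((target, energy / 2))
                survivors.append((target, energy / 2))
            else:
                survivors.append((target, energy))
        population = survivors
    return visited


if __name__ == "__main__":
    print(sorted(crawl(TOY_WEB, "a", "web crawlers")))
```

Because energy enters the population only through newly discovered relevant pages while every move has a cost, the toy population necessarily dies out once the graph is exhausted, which mirrors the "when to starve" behavior mentioned in the abstract.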

Keywords: Web information retrieval, topic-driven crawlers, online search, InfoSpiders, MySpiders, applet

Paper URL: https://doi.org/10.1023/A:1014853428272