Thursday, April 17, 2003

Crawl Grub, crawl!

Wired News featured a story on Grub, a new web search engine concept. It relies on a distributed web crawling engine that anyone can download and run on their computer, and it can also run as a screensaver, much like the famous SETI@home project screensaver. As the screensaver runs, it throws a node map of recently crawled URLs onto your screen, along with the crawl statistics.

The Grub concept really makes a lot of sense to me. As the amount of material on the web increases, the amount of work to crawl and index it increases too. Relying on a small percentage of the machines already out there to do the indexing for you is much more practical than trying to build a single huge centralized collection of indexing machines.
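To make the idea concrete, here is a minimal sketch of one way a crawl could be split across volunteer machines: hash each URL and use the hash to pick a client. This is only an illustration of the general technique. The function names `assign_client` and `partition` are made up for this example, and nothing here is taken from how Grub actually schedules its work.

```python
import hashlib
from collections import defaultdict


def assign_client(url: str, num_clients: int) -> int:
    """Map a URL to a client index in [0, num_clients) using a stable hash.

    Hypothetical helper: a real system would also have to cope with
    clients joining and leaving, which this sketch ignores.
    """
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_clients


def partition(urls, num_clients: int) -> dict:
    """Group URLs by the client that would be asked to crawl them."""
    buckets = defaultdict(list)
    for url in urls:
        buckets[assign_client(url, num_clients)].append(url)
    return dict(buckets)


if __name__ == "__main__":
    sample = [
        "http://example.com/",
        "http://example.com/about",
        "http://example.org/news",
    ]
    for client, assigned in sorted(partition(sample, 4).items()):
        print(client, assigned)
```

Because the hash is stable, the same URL always lands on the same client index, so no central machine needs to fetch the page itself; it only hands out the assignments and collects the results.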

Grub isn't actually providing its own search engine form yet (if ever); it's making its crawling results available to third parties for doing searches. The first example is the WiseNut search engine, and others will be coming soon. Crawl data is also available to users directly via an XML interface.

If all goes according to plan, Grub will accumulate enough indexing clients to crawl the entire web every day. Compare this to Google, which typically expects to take two to four weeks to crawl the web. I expect it's only a matter of time before Google launches its own Google screensaver distributed web crawling engine. But much as I love Google, I'd much rather support the young upstart Grub.
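For a rough feel of what a daily crawl would ask of each volunteer machine, the back-of-the-envelope arithmetic can be sketched as below. The page count and client count in the example are purely hypothetical placeholders, not figures from Grub, Google, or the Wired story, and `pages_per_client_per_day` is a name invented for this sketch.

```python
def pages_per_client_per_day(total_pages: int, num_clients: int, cycle_days: int) -> int:
    """Pages each client must fetch per day to cover total_pages
    once every cycle_days, rounded up, assuming an even split."""
    capacity_units = num_clients * cycle_days
    return (total_pages + capacity_units - 1) // capacity_units


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the calculation.
    total_pages = 3_000_000_000
    num_clients = 100_000

    daily = pages_per_client_per_day(total_pages, num_clients, cycle_days=1)
    three_weeks = pages_per_client_per_day(total_pages, num_clients, cycle_days=21)
    print("one-day cycle:", daily, "pages per client per day")
    print("21-day cycle:", three_weeks, "pages per client per day")
```

The point of the comparison is simply that shortening the cycle from weeks to one day multiplies the per-machine load by the same factor unless the number of clients grows to match.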

Crawl Grub, crawl!
