[Rack] Crawling Noisebridge-discuss Archives

Jared Dunne jareddunne at gmail.com
Sat May 12 19:05:03 PDT 2012


rack-

I wanted to give you a heads up that I am crawling:
https://www.noisebridge.net/pipermail/noisebridge-discuss/

With the User-Agent of:
Noisebridge-discuss Drama Detector Crawler

I am using Nutch 1.4 with the default fetcher.server.delay setting of 5
seconds between requests to the same server.  I suspect I can go faster
than that, but I'll err on the side of caution.
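
For anyone unfamiliar with what a fixed per-server delay looks like in practice, here is a rough Python sketch of the idea.  It is not Nutch's fetcher code, only an illustration under assumptions: the names fetch, polite_crawl, fetch_fn, sleep_fn and clock_fn are made up for this example, and the sketch waits the full delay after each request finishes before starting the next one.

```python
import time
import urllib.request

# Values taken from the message above; everything else is illustrative.
USER_AGENT = "Noisebridge-discuss Drama Detector Crawler"
DELAY_SECONDS = 5.0  # same value as the fetcher.server.delay setting mentioned


def fetch(url, user_agent=USER_AGENT, timeout=30):
    """Fetch one URL, identifying the crawler via the User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def polite_crawl(urls, fetch_fn=fetch, delay=DELAY_SECONDS,
                 sleep_fn=time.sleep, clock_fn=time.monotonic):
    """Fetch URLs sequentially from a single server.

    At least `delay` seconds pass between the end of one request and
    the start of the next. fetch_fn, sleep_fn and clock_fn are
    injectable (hypothetical hooks) so the pacing can be checked
    without touching the network.
    """
    pages = {}
    last_finished = None
    for url in urls:
        if last_finished is not None:
            # Sleep only for whatever part of the delay has not elapsed yet.
            remaining = delay - (clock_fn() - last_finished)
            if remaining > 0:
                sleep_fn(remaining)
        pages[url] = fetch_fn(url)
        last_finished = clock_fn()
    return pages
```

Calling polite_crawl(["https://www.noisebridge.net/pipermail/noisebridge-discuss/"]) would fetch the archive index once with the User-Agent above; with more URLs in the list, each later request would wait out the 5 second delay first.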

This is for a Noisebridge-related project.  Please let me know if there are
any problems with this.

Thanks,
Jared-