[ANN] Crawler 0.0.1

Hi all,

I recently searched in vain for a web-crawling library in Ruby. If
there is one out there already, I'd love to know about it.

In any case, I have written my own, and I hope it proves of some use to
somebody besides me.

You can find Crawler at
http://implementality.com/projects/crawler/

This is the first time I have announced a piece of software (however
small) to a public forum. I am sure that Crawler has much room for
improvement. (Starting, perhaps, with its name? Any suggestions?) So
please feel free to provide feedback, criticism, patches, etc.

For the impatient, here is how Crawler works:

Instantiate Crawler with a callback routine:

crawler = Crawler.new do |url, page_data|
  do_something_with_url(url)
  do_something_with_page_data(page_data)
end

Crawl to depth 3, invoking the above callback for each page:

crawler.crawl('http://www.rubycentral.org/', 3)
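
For the curious, the depth-limited crawl described above can be sketched roughly as follows. This is a hypothetical illustration, not Crawler's actual implementation: the ToyCrawler class, the fetcher callable, and the regexp-based link extraction are all stand-ins (a real crawler would fetch pages with net/http and use a proper HTML parser).

```ruby
require 'set'
require 'uri'

# A toy depth-limited crawler in the same spirit as Crawler's API.
# `fetcher` is any object responding to call(url) that returns the page
# body (or nil); each visited page is handed to the callback block.
class ToyCrawler
  def initialize(&callback)
    @callback = callback
  end

  def crawl(url, depth, fetcher, seen = Set.new)
    return if depth < 0 || seen.include?(url)
    seen << url
    page_data = fetcher.call(url) or return
    @callback.call(url, page_data)
    # Naive href extraction; a real crawler needs a real HTML parser.
    page_data.scan(/href="([^"]+)"/).flatten.each do |link|
      crawl(URI.join(url, link).to_s, depth - 1, fetcher, seen)
    end
  end
end

# Exercise it against an in-memory "web" so no network is needed.
pages = {
  'http://example.test/'  => '<a href="/a">a</a> <a href="/b">b</a>',
  'http://example.test/a' => '<a href="/b">b</a>',
  'http://example.test/b' => 'leaf page',
}
visited = []
crawler = ToyCrawler.new { |url, _page_data| visited << url }
crawler.crawl('http://example.test/', 2, ->(u) { pages[u] })
# visited now holds each page exactly once, despite /b being linked twice
```

Passing the fetcher in as a callable keeps the crawl logic testable against an in-memory set of pages, as above.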

happy crawling,

-brian

(ruby-)miner
(ruby-)digger
etc

:wink:


Hi Brian,
There is a module called 'webfetcher' listed in the RAA at
http://www.ruby-lang.org/en/raa.html

robert_linder_2000@yahoo.com

···

-----Original Message-----
From: Brian Denny [mailto:brian@implementality.org]
Sent: Tuesday, October 22, 2002 11:19 PM
To: ruby-talk ML
Subject: [ANN] Crawler 0.0.1


Hi Brian,
There is a module called 'webfetcher' listed in the RAA at
http://www.ruby-lang.org/en/raa.html

Thanks for the tip; that looks like a good package, and I will keep it in
mind for future projects.

My Crawler program is a lot less full-featured, and probably less robust too
(considering I just wrote it). Nevertheless, I hope that it may be useful
for those with simple web-crawling needs.

-brian