Seeking a Ruby spider robot example

Hi, I want to write a little spider to do some web searching,
but I have no idea how to start. Is there an example or something like that?
Thanks :)

yuesefa

You can see an extremely simple/limited one that I made a while back:

http://students.seattleu.edu/collinsj/programs_netcrawler.html

It may give you a place to start, but there are very good libraries for getting and parsing websites, like Rubyful Soup, Mechanize, open-uri, and so on.

Also, try searching the archives for more.

-Justin
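
For anyone who wants to see the link-gathering step on its own, here is a minimal sketch that uses only Ruby's standard `uri` library. The helper name `extract_links` and the sample HTML are made up for illustration, and the regex is a simplification; a real parser such as the libraries mentioned above will cope with messy markup far better.

```ruby
require 'uri'

# Hypothetical helper: returns absolute http(s) URLs for every <a href> in +html+.
# A regex keeps the sketch dependency-free; it is not a full HTML parser.
def extract_links(html, base_url)
  base = URI.parse(base_url)
  html.scan(/<a\s[^>]*href\s*=\s*["']([^"']+)["']/i).flatten.map { |href|
    begin
      uri = base.merge(href)              # resolve relative links against the page URL
      next nil unless uri.is_a?(URI::HTTP) # keep http/https only (skips mailto: and so on)
      uri.fragment = nil                  # drop "#section" anchors
      uri.to_s
    rescue URI::Error
      nil                                 # skip hrefs that are not valid URIs
    end
  }.compact.uniq
end

# Illustrative input only.
html = '<a href="/about">About</a> ' \
       '<a href="http://example.org/x#top">X</a> ' \
       '<a href="mailto:me@example.com">Mail</a>'
puts extract_links(html, 'http://example.com/index.html')
```

Resolving each href against the page URL matters because most links on a site are relative, and a spider needs absolute URLs to decide what it has already seen.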

You can write one with WWW::Mechanize. I have an example on my blog:

http://tenderlovemaking.com/2006/05/26/mechanize-one-liners/

There is also an example spider that comes along with Mechanize, just
look in the 'eg' directory.

Here is the spider for those who don't want to click (it's not perfect,
but it's small!):

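require 'rubygems'
require 'mechanize'

# Fetch the start URL given on the command line.
# (Mechanize 1.0 and later name this class Mechanize, without the WWW:: prefix.)
(mech = WWW::Mechanize.new).get(ARGV[0])
# Recursively click every link on the current page that has not been visited yet.
(a = lambda { |p|
  mech.page.links.each { |l|
    mech.click(l) && p.call(p) if ! mech.visited? l
  }
}).call(a)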

--Aaron
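
As a complement to the recursive one-liner, here is a rough sketch of the same idea with an explicit queue, a visited set, and a page limit, so that it cannot recurse without bound or wander off the starting host. It is not taken from Mechanize: `crawl`, its `fetch` parameter, and the `SITE` hash are hypothetical names, and the fetcher is passed in as a callable so the sketch runs without a network connection.

```ruby
require 'uri'
require 'set'

# Hypothetical crawler: +fetch+ is any callable that takes a URL string and
# returns the page body as a String, or nil if the page cannot be retrieved.
def crawl(start_url, fetch, max_pages = 50)
  visited = Set.new
  queue   = [start_url]
  host    = URI.parse(start_url).host

  until queue.empty? || visited.size >= max_pages
    url = queue.shift
    next if visited.include?(url)
    visited << url                       # counts every URL we attempted

    body = fetch.call(url)
    next unless body
    # Simplified link extraction; "#fragment" parts are cut off by the regex.
    body.scan(/href\s*=\s*["']([^"'#]+)/i).flatten.each do |href|
      begin
        link = URI.parse(url).merge(href)
      rescue URI::Error
        next                             # ignore hrefs that are not valid URIs
      end
      # Stay on the starting host and skip anything already seen.
      next unless link.is_a?(URI::HTTP) && link.host == host
      queue << link.to_s unless visited.include?(link.to_s)
    end
  end
  visited.to_a
end

# Stand-in for the network: a tiny fake site kept in a Hash (illustrative data only).
SITE = {
  'http://example.com/'  => '<a href="/a">a</a> <a href="/b">b</a>',
  'http://example.com/a' => '<a href="/">home</a> <a href="http://other.example/">out</a>',
  'http://example.com/b' => '<a href="/a">a</a>'
}
puts crawl('http://example.com/', lambda { |u| SITE[u] })
```

To crawl a real site, the lambda could be swapped for one that downloads the page, for example with open-uri (`URI.open(u).read` on Ruby 2.5 or later) or with a Mechanize agent; adding a delay between requests and honoring robots.txt would be sensible in that case.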

On Thu, Aug 24, 2006 at 02:09:28AM +0900, Haofei wrote:

Hi, I want to write a little spider to do some web searching,
but I have no idea how to start. Is there an example or something like
that?
Thanks :)

yuesefa

I found Hpricot easy to use:
http://code.whytheluckystiff.net/hpricot/

There are some good code examples on the site too.

Chris

--
Posted via http://www.ruby-forum.com/.