Hi, I want to write a little spider to do some web searching,
but I have no idea how to start. Is there an example or something like that?
Thanks,
yuesefa
Haofei wrote:
Hi, i want to write a little spider to do some web search
but have no idea how to start it. is there any example or something like that?
thanks
yuesefa
You can see an extremely simple/limited one that I made a while back:
http://students.seattleu.edu/collinsj/programs_netcrawler.html
It may give you a place to start, but there are very good libraries for getting and parsing websites, like Rubyful Soup, Mechanize, open-uri, and so on.
Also, try searching the archives for more.
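To make the "getting and parsing" step concrete, here is a rough sketch of pulling links out of a page using only the standard library. The helper name extract_links and the example.com URLs are made up for illustration, and the regex is a crude stand-in for a real parser like the libraries mentioned above. In a real spider the HTML string would come from a fetch (for example with open-uri) rather than a literal.

```ruby
require 'uri'

# Hypothetical helper (not part of any library): collect href values from
# an HTML string and resolve them against the URL the page came from.
def extract_links(html, base_url)
  hrefs = html.scan(/<a\s[^>]*href\s*=\s*["']([^"']+)["']/i).flatten
  resolved = hrefs.map do |href|
    begin
      URI.join(base_url, href).to_s
    rescue URI::Error
      nil # skip hrefs that are not valid URIs
    end
  end
  resolved.compact.uniq
end

# Stand-in page content; a real spider would download this.
html = '<a href="/about.html">About</a> <a href="http://example.org/">Ext</a>'
links = extract_links(html, 'http://example.com/index.html')
puts links
```

Relative links are resolved against the page URL, so the spider always ends up with absolute URLs it can fetch next.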
-Justin
You can write one with WWW::Mechanize. I have an example on my blog:
http://tenderlovemaking.com/2006/05/26/mechanize-one-liners/
There is also an example spider that comes along with Mechanize, just
look in the 'eg' directory.
Here is the spider for those that don't want to click (it's not perfect,
but it's small!):
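require 'rubygems'
require 'mechanize'

# Fetch the start URL, then recursively click every link not yet visited.
(mech = WWW::Mechanize.new).get(ARGV[0])
(a = lambda { |p|
  mech.page.links.each { |l|
    mech.click(l) && p.call(p) if ! mech.visited? l
  }
}).call(a)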
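If the recursion is hard to follow, the same "visit each link once" idea can be spelled out as a loop. This is only a sketch and it is not the Mechanize code: it walks breadth-first with a queue instead of recursing, and it adds a page cap. The names crawl, fetch_links and max_pages are made up, and a small hash stands in for the network so the loop can be run anywhere.

```ruby
require 'set'

# Hypothetical crawl loop: fetch_links is any callable that takes a URL
# and returns the URLs linked from that page.
def crawl(start_url, fetch_links, max_pages = 50)
  visited = Set.new
  queue = [start_url]
  until queue.empty? || visited.size >= max_pages
    url = queue.shift
    next if visited.include?(url) # already seen, like mech.visited?
    visited << url
    fetch_links.call(url).each do |link|
      queue << link unless visited.include?(link)
    end
  end
  visited.to_a
end

# A fake three-page site stands in for real HTTP requests here.
site = {
  'a' => ['b', 'c'],
  'b' => ['a', 'c'],
  'c' => []
}
order = crawl('a', lambda { |url| site.fetch(url, []) })
puts order
```

In a real spider the lambda would fetch the page and return its links. The page cap is there because an uncapped crawl of a live site may never finish.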
--Aaron
On Thu, Aug 24, 2006 at 02:09:28AM +0900, Haofei wrote:
Hi, i want to write a little spider to do some web search
but have no idea how to start it. is there any example or something like that?
thanks
yuesefa
I found Hpricot easy to use:
http://code.whytheluckystiff.net/hpricot/
There are some good code examples on the site too.
Chris