And here is the error I get:
ruby test.rb
/usr/local/lib/ruby/1.8/open-uri.rb:290:in `open_http': 500 Internal Server Error (OpenURI::HTTPError)
from /usr/local/lib/ruby/1.8/open-uri.rb:629:in `buffer_open'
from /usr/local/lib/ruby/1.8/open-uri.rb:167:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:165:in `open_loop'
from /usr/local/lib/ruby/1.8/open-uri.rb:135:in `open_uri'
from /usr/local/lib/ruby/1.8/open-uri.rb:531:in `open'
from /usr/local/lib/ruby/1.8/open-uri.rb:86:in `open'
from test.rb:2
However
require 'open-uri' # allows the use of a file-like API for URLs
open("http://www.google.com/") { |file|
  lines = file.read
  puts lines
}
Maybe the service was down?
Or they may have it restricted to prevent scraping?
You may need to provide some info to fool the site into
thinking you're a regular browser...
Use something like Ethereal to capture the packets sent between your browser
and the service. Then imitate that in code. You will just need to send the
same HTTP headers and follow any redirects that the server sends.
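A rough sketch of what "imitate that in code" could look like with Net::HTTP. The header values in BROWSER_HEADERS are made-up placeholders, not taken from any real capture, and the helper names build_request and fetch are only for illustration. It sends browser-like headers and follows redirects by hand, up to a small limit.

```ruby
require 'net/http'
require 'uri'

# Placeholder headers (assumption): replace them with whatever your own
# packet capture shows your browser sending.
BROWSER_HEADERS = {
  'User-Agent'      => 'Mozilla/5.0 (X11; Linux x86_64)',
  'Accept'          => 'text/html,application/xhtml+xml',
  'Accept-Language' => 'en-US,en;q=0.5'
}

# Build a GET request that carries the browser-like headers.
def build_request(uri, headers = BROWSER_HEADERS)
  request = Net::HTTP::Get.new(uri.request_uri)
  headers.each { |name, value| request[name] = value }
  request
end

# Fetch a URL and follow up to `limit` redirects manually.
def fetch(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.zero?

  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  response = http.request(build_request(uri))

  case response
  when Net::HTTPRedirection
    # Location may be relative, so resolve it against the current URL.
    fetch(URI.join(url, response['location']).to_s, limit - 1)
  else
    response
  end
end

# Example (needs network access):
#   puts fetch('http://www.google.com/').body
```

Doing the redirects by hand keeps the same headers on every hop, which is usually what you want when the goal is to look like one browser session.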
Good luck!
Justin
On 7/26/06, akanksha <akanksha.baid@gmail.com> wrote:
> Or they may have it restricted to prevent scraping?
> You may need to provide some info to fool the site into
> thinking you're a regular browser...
How would I go about doing that? Could you please point me to some
info?
Thank you.
Yes, that works, and so does Mechanize. Thanks!
ara.t.howard@noaa.gov wrote:
On Thu, 27 Jul 2006, akanksha wrote:
>> Maybe the service was down?
>
> The service was not down. Both urls open in a browser.
>
>> Or they may have it restricted to prevent scraping?
>> You may need to provide some info to fool the site into
>> thinking you're a regular browser...
>
> How would I go about doing that? Could you please point me to some
> info?
> Thank you.
You need to set the User-Agent header to that of a 'real' browser, something like 'Mozilla/4.0'.
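For what it's worth, open-uri takes request headers as string keys in its options hash, so something along these lines should do it. The exact User-Agent string is just an example, and read_as_browser is a made-up helper name. On the Ruby 1.8 used in this thread you would call plain open(...) with the same arguments; the sketch uses URI.open, which is what current Ruby versions expect.

```ruby
require 'open-uri'

# Example User-Agent string (assumption): any real browser's value should do.
USER_AGENT = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)'

# Read a URL while identifying as a browser. open-uri treats string keys
# in the options hash as extra HTTP request headers.
def read_as_browser(url, user_agent = USER_AGENT)
  URI.open(url, 'User-Agent' => user_agent) { |file| file.read }
end

# Example (needs network access):
#   puts read_as_browser('http://www.google.com/')
```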
-a
--
suffering increases your inner strength. also, the wishing for suffering
makes the suffering disappear.
- h.h. the 14th dalai lama