OK guys, let's say I wanted to grab the source for google.com or
something... it won't allow it unless I send the correct headers to spoof
the program. Can anyone give me a working example of how to send
headers and download a webpage's source?
I tried looking through all of the docs and coming up with something, but
I failed...
With open-uri[0] you can open URIs just like local files. That would be
entirely sufficient to get the content of the index page of google.com, for
example. Instead of a plain URL string you can also pass the open call a
URI[1] object, and in either case you can pass request headers explicitly
as a hash if you need to.
You could then also use Hpricot[2] to do all sorts of nifty HTML
parsing.
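Here is a rough sketch of the open-uri approach, not a definitive recipe. The
method name fetch_source and the header values are made up for illustration;
substitute whatever your browser really sends. It assumes a reasonably recent
Ruby where the call is spelled URI.open (older versions used a bare open).

```ruby
require "open-uri"

# Illustrative header values only (assumptions, not anything a site requires).
BROWSER_HEADERS = {
  "User-Agent"      => "Mozilla/5.0 (X11; Linux x86_64) ExampleFetcher/0.1",
  "Accept-Language" => "en-US,en;q=0.8"
}

# Hypothetical helper: fetch +url+ and return the response body as a String.
# String keys in the options hash are sent as HTTP request headers.
def fetch_source(url, headers = BROWSER_HEADERS)
  URI.open(url, headers) { |page| page.read }
end

# Example call (needs network access):
#   puts fetch_source("http://www.google.com/")
```

If the defaults are good enough, URI.open(url).read with no hash at all should
also work; the hash only matters when the server is picky about headers.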
-----Original Message-----
From: list-bounce@example.com
[mailto:list-bounce@example.com] On Behalf Of Haze Noc
Sent: Wednesday, August 15, 2007 10:05 AM
To: ruby-talk ML
Subject: HTTP headers and source
LHH shows all HTTP chatter, so there's nothing that a server can see
that you can't. From there it's just a matter of imitating the headers
with Net::HTTP.
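Imitating captured headers with Net::HTTP might look something like the sketch
below. The names COPIED_HEADERS, build_request and fetch_with_net_http are
hypothetical, and the header values are placeholders; paste in the ones you
actually observed from your browser.

```ruby
require "net/http"
require "uri"

# Placeholder values (assumptions): replace with headers copied from a real
# browser session.
COPIED_HEADERS = {
  "User-Agent"      => "Mozilla/5.0 (X11; Linux x86_64) ExampleFetcher/0.1",
  "Accept"          => "text/html,application/xhtml+xml",
  "Accept-Language" => "en-US,en;q=0.8"
}

# Hypothetical helper: build a GET request carrying the given headers.
def build_request(uri, headers = COPIED_HEADERS)
  request = Net::HTTP::Get.new(uri.request_uri)
  headers.each { |name, value| request[name] = value }
  request
end

# Hypothetical helper: send the request and return the Net::HTTPResponse.
def fetch_with_net_http(url, headers = COPIED_HEADERS)
  uri = URI.parse(url)
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https") do |http|
    http.request(build_request(uri, headers))
  end
end

# Example call (needs network access):
#   response = fetch_with_net_http("http://www.google.com/")
#   puts response.code
#   puts response.body
```

Note that this does not follow redirects; if the response code is 301 or 302
you would need to repeat the request against the Location header yourself.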
Remember, though, that you have some vague sort of obligation to
maintain netiquette. If a server rejects automated requests, its operators
may have a good reason to, and by mimicking a real browser you're going
against their wishes. I doubt the Feds are going to come kicking your door
in over it, but it's still worth trying to be respectful.