Hi,
Is there any way to remotely extract a page source with Ruby from a
website that uses session data?
Thank you,
W
wintermute wrote:
Hi,
Is there any way to remotely extract a page source with Ruby from a
website that uses session data?
Mechanize probably does what you're looking for.
--
Alex
You can do it with plain old Net::HTTP, which comes with Ruby, though it
would take a few more lines and require more work.
If you're using Windows, the easiest way would be Watir:
# literally all you need
require 'watir'
@ie=Watir::IE.start("http://whatever/website/your/getting/source/from")
puts @ie.html
Hmm. That got me thinking: it's been a while since I've used Net::HTTP,
and I may have misspoken about it needing more work.
So, Watir to an HTML file:
# the only problem is that it leaves an IE window hanging around, but it IS tiny
require 'watir'
File.open("/yea.html","w"){|f|
f.write(Watir::IE.start("http://news.google.com").html)}
Net::HTTP to an HTML file:
# just as small, easy and clean!
require 'net/http'
File.open("/yea.html","w"){|f|
f.write(Net::HTTP.get("news.google.com","/index.html"))}
The complexity comes when you want to navigate from page to page.
With Watir you do things like:
@ie=Watir::IE.start(blah)
@ie.link(:text,/tofu/).click
Easy! There is also FireWatir, which is a workalike for Firefox
(I haven't tried it myself).
Net::HTTP is a bit more involved, but quite doable, and you don't need
Windows or a GUI or anything. Sadly, I can't find the code I wrote
using it a while ago.
Still, lookie here, it's useful:
http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html
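For the session part of the question, here is a rough sketch of how it might look with plain Net::HTTP. It assumes the site tracks the session with a cookie that it sets when you POST a login form over plain HTTP on port 80. The method names session_cookie and fetch_with_session, as well as the host, paths and form fields in the usage line, are made up for illustration, so adjust them to whatever the real site expects.

```ruby
require 'net/http'

# Build a Cookie request header from the Set-Cookie headers of a response.
# Only the name=value part of each cookie is kept; attributes such as
# Path or Expires are ignored in this sketch.
def session_cookie(response)
  fields = response.get_fields('Set-Cookie') || []
  fields.map { |c| c.split(';').first.strip }.join('; ')
end

# Log in, then fetch a page on the same connection with the session
# cookie attached. All four arguments are placeholders (assumptions).
def fetch_with_session(host, login_path, page_path, form)
  Net::HTTP.start(host, 80) do |http|
    login = Net::HTTP::Post.new(login_path)
    login.set_form_data(form)
    cookie = session_cookie(http.request(login))

    page = Net::HTTP::Get.new(page_path)
    page['Cookie'] = cookie unless cookie.empty?
    http.request(page).body
  end
end
```

Usage would be something like puts fetch_with_session("example.com", "/login", "/members", "user" => "me", "pass" => "secret"). It does not follow redirects or handle HTTPS, so a real site may need a bit more than this.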
Thank you both. Exactly what I needed.