Best way to automate web browser tasks?

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

Hal

If it doesn't involve JavaScript, Mechanize should do just fine.

···

On 10/24/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

WWW::Mechanize should be able to do it.

···

On 2006.10.25 10:00, Hal Fulton wrote:

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

WWW::Mechanize[1] is pretty cool for stuff like this. There are also SafariWatir and FireWatir for Safari and Firefox respectively, but FireWatir is 20 times slower than normal Watir at the moment.

[1] http://rubyforge.org/forum/forum.php?forum_id=9457

-- Ezra Zygmuntowicz-- Lead Rails Evangelist
-- ez@engineyard.com
-- Engine Yard, Serious Rails Hosting
-- (866) 518-YARD (9273)

···

On Oct 24, 2006, at 6:00 PM, Hal Fulton wrote:

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

Hal

curl?

seriously - do you have to do it via firefox?

-a

···

On Wed, 25 Oct 2006, Hal Fulton wrote:

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

Hal

--
my religion is very simple. my religion is kindness. -- the dalai lama

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

If I were going to do it in a browser, I'd use Selenium RC [1] along with
the included selenium.rb to do things like this:

  selenium = Selenium::SeleneseInterpreter.new("localhost", 4444,
    "*firefox", "http://www.google.com/", 10000)

  selenium.start
  selenium.open "http://www.google.com/"
  selenium.type "name=q", "ruby"
  selenium.click_and_wait "name=btnG"

Cheers,
/Nick

[1] http://www.openqa.org/selenium-rc/

···

On 10/24/06, Hal Fulton <hal9000@hypermetrics.com> wrote:

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

One way is by e-mail:

http://www.faqs.org/faqs/internet-services/access-via-email/

http://www.expita.com/index.html

Some of this info may be obsolete now. Forms are more difficult
and depend on the server you use. Agora cannot do this, unless
someone's version has been improved. Getweb could do forms when I
last looked some years back. I don't know which getweb servers are
still running.

Hal

        Hugh

···

On Wed, 25 Oct 2006, Hal Fulton wrote:

Don't forget about DHTML. Without a browser you will have to write your
own browser engine to process JavaScript. For simple forms without
DHTML you can use plain TCP to send and receive HTTP requests.
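
As a rough sketch of the "plain TCP" route for a simple form: the request is just text written to a socket. The host, path, and field names below are made up for illustration, and a real site may also want cookies or other headers.

```ruby
require "socket"
require "uri"

# Build a raw HTTP/1.1 POST request for a URL-encoded form.
# host, path, and the field names are illustrative assumptions.
def build_post_request(host, path, fields)
  body = URI.encode_www_form(fields)
  [
    "POST #{path} HTTP/1.1",
    "Host: #{host}",
    "Content-Type: application/x-www-form-urlencoded",
    "Content-Length: #{body.bytesize}",
    "Connection: close",
    "",
    body
  ].join("\r\n")
end

# Send the request over a plain TCP socket and return the raw response text.
def post_form(host, port, path, fields)
  TCPSocket.open(host, port) do |sock|
    sock.write(build_post_request(host, path, fields))
    sock.read
  end
end
```

Something like post_form("example.com", 80, "/save", "title" => "Hello world") would then return the status line, headers, and body as one string. In practice a library (Net::HTTP, Mechanize, curl) saves you from parsing that by hand.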

Alex
http://webunittesting.com

Hal Fulton wrote:

···

I know there's Watir or something... but I'm not using
IE (rather Firefox) and I don't want to "control the
browser" per se...

Rather I want to automate some repetitive tasks... like
go to a form, fill in a couple of fields, click a checkbox,
pick from a dropdown, and click on Save.

How would you do this?

Hal

No, I *don't* have to do it via FF or any other browser.

Can curl do that sort of thing? I've never used it
except for simple sucking-down of pages.

Hal

···

ara.t.howard@noaa.gov wrote:

curl?

seriously - do you have to do it via firefox?

I worked on something very similar recently. We looked at Beautiful
Soup, Rubyful Soup, and Mechanize, and went with Mechanize in the end.
I think Beautiful Soup is actually better, but it's in Python. For me
that was a minus, and for the other programmer on the project, it was
a deal-breaker. Rubyful Soup is a direct port of Beautiful Soup, but it
is literally ten times slower. Mechanize has performance equivalent to
Beautiful Soup and is pretty easy to use as well. I think some of the
underlying code comes from why the lucky stiff. That's generally
considered a good thing.

···

--
Giles Bowkett
http://www.gilesgoatboy.org

+1

It runs inside the browser, so it executes JavaScript and such.

···

On Wed, Oct 25, 2006 at 11:02:37AM +0900, Nick Sieger wrote:

On 10/24/06, Hal Fulton <hal9000@hypermetrics.com> wrote:
>
>I know there's Watir or something... but I'm not using
>IE (rather Firefox) and I don't want to "control the
>browser" per se...
>
>Rather I want to automate some repetitive tasks... like
>go to a form, fill in a couple of fields, click a checkbox,
>pick from a dropdown, and click on Save.
>
>How would you do this?

If I were going to do it in a browser, I'd use Selenium RC [1] along with
the included selenium.rb to do things like this:

> selenium = Selenium::SeleneseInterpreter.new("localhost", 4444,
>   "*firefox", "http://www.google.com/", 10000)
> selenium.start
> selenium.open "http://www.google.com/"
> selenium.type "name=q", "ruby"
> selenium.click_and_wait "name=btnG"

--
Esteban Manchado Velázquez <zoso@foton.es> - http://www.foton.es
EuropeSwPatentFree - http://EuropeSwPatentFree.hispalinux.es

curl?

seriously - do you have to do it via firefox?

No, I *don't* have to do it via FF or any other browser.

Can curl do that sort of thing? I've never used it
except for simple sucking-down of pages.

Yes.

-F/--form <name=content>

(HTTP) This lets curl emulate a filled in form in which a user has
pressed the submit button. This causes curl to POST data using the
content-type multipart/form-data according to RFC1867. This enables
uploading of binary files etc. To force the content part to be a
file, prefix the file name with an @ sign. To just get the content part
from a file, prefix the file name with the letter <. The difference
between @ and < is then that @ makes a file get attached in the post as
a file upload, while the < makes a text field and just get the contents
for that text field from a file.

Example, to send your password file to the server, where password is
the name of the form-field to which /etc/passwd will be the input:

curl -F password=@/etc/passwd www.mypasswords.com

To read the file's content from stdin instead of a file, use
- where the file name should've been. This goes for both @ and <
constructs.

You can also tell curl what Content-Type to use for the file upload
part, by using type=, in a manner similar to:

curl -F "web=@index.html;type=text/html" url.com

See further examples and details in the MANUAL.

This option can be used multiple times.
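
To make the @ versus < distinction concrete, here is a hand-rolled sketch of the multipart/form-data body that such a POST carries. It is only an illustration of the wire format, not what curl does internally; the field names, file name, and boundary string are assumptions.

```ruby
require "securerandom"

# Build a multipart/form-data body by hand.
# Each part is { name:, value: } for a plain text field, or
# { name:, value:, filename:, type: } for a file upload.
def multipart_body(parts, boundary)
  body = +""
  parts.each do |part|
    disposition = "form-data; name=\"#{part[:name]}\""
    disposition += "; filename=\"#{part[:filename]}\"" if part[:filename]
    body << "--#{boundary}\r\n"
    body << "Content-Disposition: #{disposition}\r\n"
    body << "Content-Type: #{part[:type]}\r\n" if part[:type]
    body << "\r\n" << part[:value] << "\r\n"
  end
  body << "--#{boundary}--\r\n"
end

boundary = "sketch#{SecureRandom.hex(8)}"
parts = [
  # roughly what -F "web=@index.html;type=text/html" describes: a file upload
  { name: "web", value: "<h1>hi</h1>", filename: "index.html", type: "text/html" },
  # roughly what -F "savetext=<notes.txt" describes: a text field filled from a file
  { name: "savetext", value: "plain text read from a file" }
]
puts multipart_body(parts, boundary)
```

The upload part carries a filename and its own Content-Type, while the text field is just a name and a value. That is the whole difference between the two prefixes as the man page describes them.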

that, and then some.

here's a script i use to post to sciruby:

     #! /usr/bin/env ruby

     $VERBOSE = nil

     #
     # built-in
     #
       require "getoptlong"
     #
     # setup
     #
       uri = "http://sciruby.codeforpeople.com/sr.cgi"
       moin_id = ENV['SCIRUBY_MOIN_ID']
     #
     # options
     #
       opts = {}

       GetoptLong::new(
         [ "--moin_id", "-m", GetoptLong::REQUIRED_ARGUMENT ]
       ).each{|opt, arg| opts[opt.delete("-")] = arg}

       moin_id = opts["moin_id"] || ENV["MOIN_ID"] || moin_id
     #
     # argv
     #
       page, infile = ARGV.shift, ARGV.shift
     #
     # run
     #
       abort "#{ $0 } page [infile or stdin] [--moin_id=moin_id]" unless page

       page = "http://sciruby.codeforpeople.com/sr.cgi/#{ page }" unless
         page =~ %r/^http/

       data = (infile.nil? or infile == "-") ? STDIN.read : open(infile){|f| f.read}

       command = <<-sh
       curl "#{ page }" \
              -s -S --stderr - \
              -bMOIN_ID=#{ moin_id } -A=Mozilla/4.0 \
              -F action=savepage -F comment=curl -F "savetext=<-"
       sh
       command = command.strip.split(%r/\s+/).join(" ")

       STDERR.puts command
       IO::popen("#{ command }", "r+") do |pipe|
         pipe.puts data
         pipe.close_write
         while((line = pipe.gets))
           print line
         end
       end

       abort "command <#{ command }> failed with <#{ $?.inspect }>" unless
         $? == 0
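
A variation on the script above, as a sketch only: IO.popen also accepts the command as an array, which skips the shell so that page names or ids containing spaces or quotes need no escaping. The helper names are made up here, and "-A", "Mozilla/4.0" is my reading of the intended user-agent flag, not a quote from the script.

```ruby
# Build the curl invocation as an argument vector rather than a shell string.
# curl_savepage_argv and post_page are hypothetical helper names.
def curl_savepage_argv(page, moin_id)
  [
    "curl", page,
    "-s", "-S", "--stderr", "-",
    "-b", "MOIN_ID=#{moin_id}",
    "-A", "Mozilla/4.0",
    "-F", "action=savepage",
    "-F", "comment=curl",
    "-F", "savetext=<-" # "<-" tells curl to read this field's text from stdin
  ]
end

# Run curl with `data` on its stdin; return [output, success flag].
def post_page(page, moin_id, data)
  output = IO.popen(curl_savepage_argv(page, moin_id), "r+") do |pipe|
    pipe.write(data)
    pipe.close_write
    pipe.read
  end
  [output, $?.success?]
end
```

Calling post_page(page, moin_id, STDIN.read) would then stand in for the heredoc, the split/join, and the read loop.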

you might also consider http-access2, here's an example

   http://codeforpeople.com/lib/ruby/rubyforge/rubyforge-0.1.1/bin/rubyforge

nearly all of what you need to know is at the beginning or the very end.

cheers.

-a
--
my religion is very simple. my religion is kindness. -- the dalai lama

Philip Hallstrom wrote:

curl?

seriously - do you have to do it via firefox?

No, I *don't* have to do it via FF or any other browser.

Can curl do that sort of thing? I've never used it
except for simple sucking-down of pages.

Yes.

-F/--form <name=content>

[snip snip]

Doing a 'man curl' I see now that it has a plethora of options --
someone once said, a metric sh*tload.

It looks a little painful, though. I suppose for a dropdown you'd
have to type the full value of the selected option? Or am I thinking
of a checkbox?
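
For what it's worth, here is a small sketch of what a browser actually submits for these controls. A dropdown sends the selected option's value attribute (not its visible label), and a checkbox sends its name only when checked, with the value "on" if the markup gives none. The field names and values are invented for illustration.

```ruby
require "uri"

# Collect the name=value pairs a browser would submit.
#   text:     name => typed text
#   selected: name => the chosen <option>'s value attribute
#   checked:  name => value attribute, or nil for the default "on"
#             (unchecked boxes are simply left out)
def form_pairs(text:, selected:, checked:)
  pairs = text.to_a + selected.to_a
  checked.each { |name, value| pairs << [name, value || "on"] }
  pairs
end

pairs = form_pairs(
  text:     { "title" => "Weekly report" },
  selected: { "category" => "7" },   # the option's value, not its label
  checked:  { "published" => nil }   # no value attribute, so "on"
)
puts URI.encode_www_form(pairs)
# prints: title=Weekly+report&category=7&published=on
```

With curl, each pair would become its own -F name=value (or one -d string for a URL-encoded form), so the values to type are whatever sits in the value attributes of the page source.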

I see now that this thing has a lot of Javascript in it. When I hover
over the New button, it says:

   javascript:hideMainMenu();submitbutton('new');

which just complicates things that much more.

What about Mechanize? Better/worse/different?

Thanks,
Hal
