Pkg for getting web pages

Hi:

What is the ‘best’ package for getting pages
from the web? Webfetcher looks like it has
not been updated for a while, and http-access
is listed as not supporting cookies.

Is there a common pkg that people are currently
using?

Thanks

···


Jim Freeze

Be braver – you can’t cross a chasm in two small jumps.

wget is good enough: Wget - GNU Project - Free Software Foundation

Gennady.

···

----- Original Message -----
From: “Jim Freeze” jim@freeze.org
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Friday, June 06, 2003 8:50 AM
Subject: Pkg for getting web pages

Hi:

What is the ‘best’ package for getting pages
from the web? Webfetcher looks like it has
not been updated for a while, and http-access
is listed as not supporting cookies.

Is there a common pkg that people are currently
using?

Thanks


Jim Freeze

Be braver – you can’t cross a chasm in two small jumps.

There was one which was announced on ruby-talk around the beginning of this
year, and I used it successfully to download a copy of the pickaxe as HTML
(before I realised it was available for direct download anyway :-)

Ah yes, found it: httpsnapshot-0.3.5.36.tar.gz

I only used it once, but it seemed to do the job (if mirroring a website is
what you’re after).

I can’t find the announcement on the ruby-talk archives, but it is on RAA.

Cheers,

Brian.

···

On Sat, Jun 07, 2003 at 12:50:35AM +0900, Jim Freeze wrote:

What is the ‘best’ package for getting pages
from the web? Webfetcher looks like it has
not been updated for a while, and http-access
is listed as not supporting cookies.

Thanks. I was actually looking for a ruby pkg, but this
looks very nice.
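For a pure-Ruby route, the standard library’s Net::HTTP is worth a look:
it fetches pages fine, though it leaves cookie handling to the caller
(you read the Set-Cookie header yourself). A minimal sketch, with a
throwaway TCPServer standing in for a real site so it is self-contained:

```ruby
require 'net/http'
require 'socket'

# Throwaway server standing in for a real site; it sets one
# cookie and returns a tiny page, then closes the connection.
server = TCPServer.new('127.0.0.1', 0)
port   = server.addr[1]
Thread.new do
  client = server.accept
  # Discard the request headers up to the blank line.
  while (line = client.gets) && line != "\r\n"; end
  client.write "HTTP/1.1 200 OK\r\n" \
               "Set-Cookie: session=abc123\r\n" \
               "Content-Length: 18\r\n" \
               "Connection: close\r\n\r\n" \
               "<html>hello</html>"
  client.close
end

# The actual fetch: Net::HTTP is part of Ruby's standard library.
res    = Net::HTTP.get_response('127.0.0.1', '/', port)
cookie = res['Set-Cookie']   # Net::HTTP does not manage cookies itself
body   = res.body
```

To talk to a cookie-setting site you would send `cookie` back by hand in
a `Cookie:` request header on the next request.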

···

On Saturday, 7 June 2003 at 3:48:59 +0900, Gennady wrote:

wget is good enough: Wget - GNU Project - Free Software Foundation


Jim Freeze

1.79 x 10^12 furlongs per fortnight – it’s not just a good idea, it’s
the law!

Oops, I did not gather from your post that you wanted a Ruby package. But
anyway, Ruby is great for interfacing with other programs, and if something
is already done well by some specialized tool, it is worth using that tool
from Ruby instead of reinventing the wheel ;-), especially when the tool is
readily available.
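In practice that approach might look like the sketch below, where the URL
is only a placeholder and wget is assumed to be on the PATH:

```ruby
# Sketch: drive wget from Ruby instead of re-implementing HTTP.
# The URL is a placeholder; wget is assumed to be installed.
url = "http://example.com/"
cmd = ["wget", "-q", "-O", "-", url]   # -O - writes the page to stdout

if system("which wget > /dev/null 2>&1")
  page = IO.popen(cmd) { |io| io.read }
  puts "fetched #{page.bytesize} bytes"
else
  warn "wget not found; install it or fall back to net/http"
end
```

Passing the command as an array to IO.popen avoids shell quoting issues
with the URL.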

Gennady.

···

----- Original Message -----
From: “Jim Freeze” jim@freeze.org
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Friday, June 06, 2003 12:19 PM
Subject: Re: Pkg for getting web pages

On Saturday, 7 June 2003 at 3:48:59 +0900, Gennady wrote:

wget is good enough: Wget - GNU Project - Free Software Foundation

Thanks. I was actually looking for a ruby pkg, but this
looks very nice.


Jim Freeze

1.79 x 10^12 furlongs per fortnight – it’s not just a good idea, it’s
the law!

Saluton!

  • Gennady; 2003-06-06, 20:14 UTC:

But anyway, Ruby is great for interfacing to other programs and if
something is being done well by some specialized tool it is worth
using it from Ruby instead of reinventing a bicycle in Ruby ;-),
especially when the tool is readily available.

The question is still open to discussion: which tool? Many people
know w3 but that is not the only tool that fits. There are others,
most notably curl, which comes with a library (for C, IIRC).

curl
is a client to get documents/files from or send documents to a
server, using any of the supported protocols (HTTP, HTTPS, FTP,
GOPHER, DICT, TELNET, LDAP or FILE). The command is designed to work
without user interaction or any kind of interactivity. curl offers a
busload of useful tricks like proxy support, user authentication, ftp
upload, HTTP post, SSL (https:) connections, cookies, file transfer
resume and more.

http://curl.haxx.se/
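Since cookies were the sticking point in the original question, curl’s
cookie-jar flags are worth noting. A sketch of driving them from Ruby
(the URL and jar filename are placeholders):

```ruby
# Sketch: curl with a cookie jar, called from Ruby.
# -c writes received cookies to the jar, -b sends them back,
# -s keeps curl quiet. URL and filename are placeholders.
url = "http://example.com/"
jar = "cookies.txt"
cmd = ["curl", "-s", "-c", jar, "-b", jar, url]
# page = IO.popen(cmd) { |io| io.read }   # uncomment to actually fetch
```

Reusing the same file for `-b` and `-c` makes cookies persist across
separate curl invocations, which is what a login-then-fetch session needs.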

snarf
can transfer files through the http, gopher, finger, and ftp
protocols without user interaction. It is small and fast, can
force active FTP (the default is passive), resume downloading a
partially transferred file, and spoof the MSIE as well as the
Navigator user-agent string, and it checks the SNARF_PROXY,
FTP_PROXY, GOPHER_PROXY, HTTP_PROXY, and PROXY environment variables.

http://www.xach.com/snarf/index.html

Gis,

Josef ‘Jupp’ Schugt