Reading from a URL

This is what I’m currently doing to read from a URL through the company
proxy server and it’s working.

require 'net/http'
require 'uri'

uri = URI.parse(someURL)

# Get the proxy server and port (both stay nil if no proxy is set).
proxy = ENV['HTTP_PROXY'] || ENV['http_proxy']
%r{(\w+)://([\w.-]+):(\d+)}.match(proxy.to_s)
protocol, server, port = $1, $2, $3

# Connect through the proxy server if there is one.
Net::HTTP::Proxy(server, port && port.to_i).start(uri.host, uri.port) do |http|
  response = http.get(uri.request_uri)
  # use response.body
end
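
As a possible alternative to the regular expression, the proxy setting could be parsed with URI itself. This is only a sketch: the helper name proxy_host_and_port and the proxy.example.com address are made up for illustration, and it assumes the variable holds a full URL such as http://host:port.

```ruby
require 'uri'

# Hypothetical helper (not part of Net::HTTP): returns [host, port] for the
# proxy named in HTTP_PROXY / http_proxy, or [nil, nil] when neither is set.
def proxy_host_and_port(env = ENV)
  value = env['HTTP_PROXY'] || env['http_proxy']
  return [nil, nil] if value.nil? || value.empty?

  proxy = URI.parse(value)
  [proxy.host, proxy.port] # port comes back as an Integer
end

# Example with an assumed proxy address instead of the real environment.
host, port = proxy_host_and_port('http_proxy' => 'http://proxy.example.com:8080')
```

The [nil, nil] result mirrors the snippet above, where server and port are nil when no proxy is configured.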

-----Original Message-----
From: Alexander Bokovoy [mailto:a.bokovoy@sam-solutions.net]
Sent: Tuesday, October 08, 2002 4:55 AM
To: ruby-talk@ruby-lang.org
Subject: Re: reading from a URL

On Tue, Oct 08, 2002 at 03:22:07AM +0900, GOTO Kentaro wrote:

At Tue, 8 Oct 2002 02:33:52 +0900, Volkmann, Mark Mark.Volkmann@AGEDWARDS.com wrote:

I need to read the content of an HTTP URL. I can break the URL up into
server, port and path and use Net::HTTP to read it, but it seems like there
should be an easier way. I’d really like to pass the URL string to a method
of some class that would return an object from a class that inherits from
IO. Does something like that exist? If not, is there an easier way than
using Net::HTTP?

If you have Ruby 1.7, try

require "net/http"
require "uri"
content = Net::HTTP.get(URI.parse("http://www.example.com/"))

If not,

require "net/http"
require "uri"
uri = URI.parse("http://www.example.com/")
content = Net::HTTP.get(uri.host, uri.request_uri, uri.port)
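
To make the three arguments in the call above concrete, here is a small sketch of what URI.parse hands back for a URL with a path and a query. The URL is an assumed example, not one from the thread.

```ruby
require 'uri'

# Assumed example URL, chosen only to show the individual components.
uri = URI.parse('http://www.example.com/docs/index.html?lang=en')

host = uri.host         # "www.example.com"
port = uri.port         # 80, the default port for http
path = uri.request_uri  # "/docs/index.html?lang=en" (path plus query string)
```

These are the server, port and path the original question wanted to avoid splitting out by hand.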

Read net/http.rb if you need HTTP auth, redirection or a proxy server.
This library includes its own documentation.

BTW, the current Net::HTTP code does not support proxies with authentication
enabled, which makes it rather unusable in corporate networks. I’ve sent a
patch to fix it to Matz and he forwarded it to Minero Aoki (the net/http.rb
maintainer), but the fix still isn’t in CVS.

I’m attaching it here in the meantime.


/ Alexander Bokovoy

Q: What is the sound of one cat napping?
A: Mu.




You miss the point. Your proxy isn’t requiring authorization and thus it
works. With authorization enabled this won’t work because the
‘Proxy-Authorization:’ header field isn’t set.
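
For illustration, this is roughly what such a header value looks like when the proxy uses the Basic scheme. The thread does not say which scheme the proxy in question uses, and the attached patch is not shown, so this is only a sketch under that assumption. The helper name basic_proxy_authorization and the credentials are made up.

```ruby
# Hypothetical helper: builds a Proxy-Authorization value for the Basic
# scheme, i.e. "Basic " followed by base64("user:password").
def basic_proxy_authorization(user, password)
  # pack('m0') base64-encodes without a trailing newline.
  'Basic ' + ["#{user}:#{password}"].pack('m0')
end

# Assumed example credentials, for demonstration only.
headers = {
  'Proxy-Authorization' => basic_proxy_authorization('Aladdin', 'open sesame')
}
```

Later versions of Net::HTTP accept a proxy user and password as extra arguments to Net::HTTP::Proxy, so building the header by hand should only be needed with a library version that lacks that support.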


On Tue, Oct 08, 2002 at 11:39:43PM +0900, Volkmann, Mark wrote:

This is what I’m currently doing to read from a URL through the company
proxy server and it’s working.

/ Alexander Bokovoy


I’m having an EMOTIONAL OUTBURST!! But, uh, WHY is there a WAFFLE in
my PAJAMA POCKET??

Alexander Bokovoy a.bokovoy@sam-solutions.net writes:

You miss the point. Your proxy isn’t requiring authorization and thus it

A bit off topic. Since I have never worked in an enterprise: why do
they put username/passwd authorisation on a web proxy? Can’t the proxy
be set as to serve only the internal network and reject connection
from outside?

Or is it because the proxy is a two-way proxy, allowing
ppl with username/passwd use the proxy to fetch internal pages? If so,
why not use VPN and allow authenticated ppl from outside do more than
fetching internal pages?

YS.

At PACCAR, proxy authentication was used to track which users viewed which
pages. IP addresses were assigned by DHCP, so you could only trace the porn
viewing back to a desktop. Authenticating users allowed IT to find out who
(or rather, which username/password) viewed the porn.


Yohanes Santoso (ysantoso@jenny-gnome.dyndns.org) wrote:

A bit off topic. Since I have never worked in an enterprise: why do
they put username/passwd authorisation on a web proxy? Can’t the proxy
be set as to serve only the internal network and reject connection
from outside?


Eric Hodel - drbrain@segment7.net - http://segment7.net
All messages signed with fingerprint:
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04