Http/net

Hi,

Is this possible with net/http?

data = `lynx http://linux-lx:80/ -auth=hansen:123456 -dump`
puts data

The page needs a user name and a password for login.

Manfred

The page needs a user name and a password for login.

Well, if you use basic authentication, the documentation has this:

=== Basic Authentication

    require 'net/http'
    Net::HTTP.version_1_1 # declare to use 1.1 features.

    Net::HTTP.start( 'auth.some.domain' ) {|http|
        response, body = http.get( '/need-auth.cgi',
                'Authorization' => 'Basic ' + ["#{account}:#{password}"].pack('m').strip )
        print body
    }

In version 1.2 (Ruby 1.7 or later), you can write like this:

    require 'net/http'
    Net::HTTP.version_1_2 # declare to use 1.2 features.

    req = Net::HTTP::Get.new('/need-auth.cgi')
    req.basic_auth 'account', 'password'
    Net::HTTP.start( 'auth.some.domain' ) {|http|
        response = http.request(req)
        print response.body
    }

Guy Decoux
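
For anyone wondering what the hand-built header in the first example contains: it is the word Basic followed by the Base64 encoding of "account:password". Here is a minimal sketch, assuming a Ruby recent enough that the 1.2-style API is the default (so no Net::HTTP.version_1_2 call is needed). The credentials are the ones from the lynx command in the question, and nothing is sent over the network.

```ruby
require 'net/http'

# Credentials from the original lynx command; substitute your own.
account  = 'hansen'
password = '123456'

# Build the header value by hand: "Basic " + Base64("account:password").
# pack('m') appends a newline, hence the strip.
manual = 'Basic ' + ["#{account}:#{password}"].pack('m').strip

# Let net/http build the same header on a request object (nothing is sent).
req = Net::HTTP::Get.new('/')
req.basic_auth account, password

puts manual
puts req['Authorization'] == manual
```

Both routes should give the same header, so basic_auth is simply the more convenient spelling where it is available.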

Thanks for your advice.

Manfred Hansen


First, let me say that I am a net/http dummy.

I would like to collect information from a website
at work. From my Netscape perspective, I type in
the URL:

http://xyz.mycompany.com

It automatically redirects to

http://xyz.mycompany.com:3000/scripts/isynch.dll

If I have not logged on, I get a login popup dialogue asking
for my username and password.

I tried the following code, but it gave an error.
Can someone point me in the right direction?

================
require 'net/http'
Net::HTTP.version_1_1 # declare to use 1.1 features.

url0 = "http://xyz.mycompany.com"
url = "http://xyz.mycompany.com:3000/"
auth = "/scripts/isynch.dll"
account = "me"
passwd = "notmyrealpassword"

Net::HTTP.start( url ) {|http|
    response, body = http.get( auth,
            'Authorization' => 'Basic ' + ["#{account}:#{passwd}"].pack('m').strip )
    print body
}

Thanks

Jim


Jim Freeze

Programming Ruby
def initialize; fun; end
A language with class

Oops, forgot to send the error message:

./get.rb
/usr/local/lib/ruby/1.6/net/protocol.rb:469:in `new': getaddrinfo: no address associated with hostname. (SocketError)
        from /usr/local/lib/ruby/1.6/net/protocol.rb:469:in `connect'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:468:in `timeout'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:468:in `connect'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:462:in `initialize'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:159:in `new'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:159:in `conn_socket'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:148:in `connect'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:142:in `_start'
        from /usr/local/lib/ruby/1.6/net/protocol.rb:128:in `start'
        from /usr/local/lib/ruby/1.6/net/http.rb:455:in `start'
        from ./get.rb:14

Net::HTTP.start( url ) {|http|

should be Net::HTTP.start(host[, port]){|http|

./get.rb
/usr/local/lib/ruby/1.6/net/protocol.rb:469:in `new': getaddrinfo: no address associated with hostname. (SocketError)

This happened because Net::HTTP tried to look up the address of a host named “http://…”.
That string is a URL, not a hostname, so the DNS lookup failed.
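
If the address is only at hand as a full URL, the standard uri library can split it into the pieces Net::HTTP.start expects. A small sketch, using the URL from the question (no network access is involved in parsing):

```ruby
require 'uri'

# The URL from the question; URI.parse only parses the string.
uri = URI.parse('http://xyz.mycompany.com:3000/scripts/isynch.dll')

host = uri.host   # hostname only, without the "http://" scheme
port = uri.port   # explicit port from the URL
path = uri.path   # what should be passed to http.get

puts "Net::HTTP.start(#{host.inspect}, #{port}), then http.get(#{path.inspect}, ...)"
```

The host and port go to Net::HTTP.start, and only the path goes to http.get.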

net/http.rb includes reference documentation and samples. Try:

% rd2 http.rb > http.html

or

% ruby -ne 'print if /^=begin/.../^=end/' http.rb | more

– Gotoken

Thanks, that helped. I changed the code to the following:

================================================
require 'net/http'
Net::HTTP.version_1_1 # declare to use 1.1 features.

url = "xyz.mycompany.com"
port = 3000
path = "/scripts/isynch.dll"
account = "me"
passwd = "notarealpassword"

Net::HTTP.start( url, port ) {|http|
    begin
        response, body = http.get( path,
                'Authorization' => 'Basic ' + ["#{account}:#{passwd}"].pack('m').strip )
        print body
    rescue Net::ProtoRetriableError => err
        # On a redirect, pull host and path out of the Location header and retry.
        if m = %r<http://([^/]+)>.match( err.response['location'] )
            host = m[1].strip
            path = m.post_match
            retry
        end
    end
}

The code works, but I get back a frame:

Modeling Request System

Is there a way to automatically get the frame documents, or do
I have to search the returned text and filter out the
new paths?

Thanks


You can pick up values of SRC from html as follows:

html = …

require "uri"
current_uri = URI.parse("http://xyz.mycompany.com:3000/scripts/isynch.dll")

pattern = /<.+?>|[^<>]+/m
href = /href\s*=\s*"(.+?)"/im
frame = html.scan(pattern).grep(/^<\s*frame\s+.*src=/i).map{|src|
    /src="(.+?)"|src=([^\s>]+)/i.match(src)
    current_uri.merge($1 || $2)
}
p frame

– Gotoken
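
To try the idea without a server, here is a self-contained sketch of the same approach. The frameset below is a made-up stand-in for the real page (only the title comes from this thread; the frame names and src values are assumptions), and the single regular expression is a simplification, not Gotoken's exact pattern.

```ruby
require 'uri'

# Hypothetical frameset standing in for the page returned by the server.
html = <<~HTML
  <html><head><title>Modeling Request System</title></head>
  <frameset rows="20%,80%">
    <frame name="menu" src="isynch.dll?SyncNotesTopMenu">
    <frame name="intro" src=/scripts/isynch.dll?SyncNotesTopIntro>
  </frameset></html>
HTML

base = URI.parse('http://xyz.mycompany.com:3000/scripts/isynch.dll')

# Capture the src value of each <frame> tag: double-quoted, single-quoted, or bare.
frame_src = /<\s*frame\b[^>]*?\bsrc\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))/i

frames = html.scan(frame_src).map do |captures|
  base.merge(captures.compact.first)   # resolve relative to the page's own URI
end

frames.each { |u| puts u }
```

URI#merge handles both relative and absolute-path src values, so the result is a list of complete URIs that can be fetched one by one. For anything beyond simple pages a real HTML parser would be more robust than a regular expression.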

Ok, thanks. Now frame has the value:

[#<URI::HTTP:0x409000e URL:http://xyz.mycompany.com:3000/scripts/isynch.dll?SyncNotesTopMenu>, #<URI::HTTP:0x408fd66 URL:http://xyz.mycompany.com:3000/scripts/isynch.dll?panel=TclScript&file=QuickView.tcl>, #<URI::HTTP:0x408fa50 URL:http://xyz.mycompany.com:3000/scripts/isynch.dll?SyncNotesTopIntro>]

…but, sorry to say, I am still a big dummy with regards to net/http.
Is there a way to directly use a uri to get the page data?

Thanks

Hi,

In mail “Re: http/net”

…but, sorry to say, I am still a big dummy with regards to net/http.
Is there a way to directly use a uri to get the page data?

Net::HTTP.get accepts URI objects.
Other Net::HTTP methods do NOT accept URI objects.
Because:

Net::HTTP.start('www.example.com') {|http|
    print http.get(URI.parse('http://www.ruby-lang.org/pub/ruby-1.6.6.tar.gz'))
}

This operation does not make sense.

– Minero Aoki

Minero Aoki aamine@mx.edit.ne.jp wrote:

Net::HTTP.get accepts URI objects.

This change was made in 1.7.

– Minero Aoki.
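
For the pages in this thread, which also need Basic authentication, one possible approach is to take the URI apart by hand and feed the pieces to the 1.2-style API. This is only a sketch: build_request and fetch are hypothetical helper names, not part of net/http, and it assumes a Ruby where the 1.2 API is the default. Only fetch touches the network, and it is not called here.

```ruby
require 'net/http'
require 'uri'

# Turn a URI into the pieces Net::HTTP wants: host, port, and a request
# object carrying the path, query and Basic credentials. Nothing is sent here.
def build_request(uri, account, passwd)
  req = Net::HTTP::Get.new(uri.request_uri)   # path plus "?query", if any
  req.basic_auth account, passwd
  [uri.host, uri.port, req]
end

# Hypothetical helper: performs the actual fetch (needs network access).
def fetch(uri, account, passwd)
  host, port, req = build_request(uri, account, passwd)
  Net::HTTP.start(host, port) { |http| http.request(req).body }
end

uri = URI.parse('http://xyz.mycompany.com:3000/scripts/isynch.dll?SyncNotesTopMenu')
host, port, req = build_request(uri, 'me', 'notmyrealpassword')
puts "#{host}:#{port} #{req.method} #{req.path}"
```

With that in place, each frame URI collected earlier could be passed to fetch in turn.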