Net/HTTP is flaky?

Inexplicably, the following code fails:

http = Net::HTTP.new("somesite.com")
puts http.get("/")[1]

    ...with this error:

ruby/1.8/net/http.rb:925:in '[]': undefined method 'downcase' for 1:Fixnum
(NoMethodError)
        from web.rb:5

    Now, I do have access to a Linux system with Ruby 1.6 and it sometimes
succeeds but often fails with this error:

/usr/lib/ruby/1.6/net/protocol.rb:221:in `error!': 403 "Forbidden"
(Net::ProtoFatalError)
        from /usr/lib/ruby/1.6/net/http.rb:1217:in `value'
        from /usr/lib/ruby/1.6/net/http.rb:605:in `get'
        from ./web.rb:16

    How is this possible? How can it sometimes succeed and sometimes fail?
Why can't this reliably work?
    Thank you...

    Inexplicably, the following code fails:

http = Net::HTTP.new("somesite.com")
puts http.get("/")[1]

    ...with this error:

ruby/1.8/net/http.rb:925:in '[]': undefined method 'downcase' for 1:Fixnum
(NoMethodError)
        from web.rb:5

The interface to Net::HTTP changed between 1.6 and 1.8
You want
http.get('/').body
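
To spell out why the original line blows up: on 1.8 and later, `get` returns a response object, and `[]` on that object is a header lookup by name, which is where the `downcase` in the traceback comes from. A minimal sketch, using a hand-built response as a stand-in (an assumption made only so the snippet runs without a network connection; in real code the object comes from `http.get('/')`):

```ruby
require 'net/http'

# Hand-built stand-in for what http.get('/') returns on 1.8 and later.
# (Assumption for illustration: real responses come from the server.)
res = Net::HTTPOK.new('1.1', '200', 'OK')
res['Content-Type'] = 'text/html'

puts res.code              # status code, as a String
puts res['content-type']   # [] looks up a header; the name is case-insensitive

begin
  res[1]                   # the call from the original post: 1 is not a header name
rescue NoMethodError => e
  puts "res[1] fails: #{e.class}"
end

# With a live connection, the page itself is in the body:
#   http = Net::HTTP.new('somesite.com')
#   puts http.get('/').body
```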

    Now, I do have access to a Linux system with Ruby 1.6 and it sometimes
succeeds but often fails with this error:

/usr/lib/ruby/1.6/net/protocol.rb:221:in `error!': 403 "Forbidden"
(Net::ProtoFatalError)
        from /usr/lib/ruby/1.6/net/http.rb:1217:in `value'
        from /usr/lib/ruby/1.6/net/http.rb:605:in `get'
        from ./web.rb:16

    How is this possible? How can it sometimes succeed and sometimes fail?
Why can't this reliably work?
    Thank you...

It doesn't reliably work because this is the internet. Also the particular error you are seeing means that the webserver has forbidden access to the url you are trying to, um, access.
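
The 1.6 library turned that 403 status into an exception, as the traceback shows; on 1.8 and later `get` hands back a response object whose class tells you what the server said. A rough sketch of telling the cases apart (`describe` is a made-up helper name, and the response is built by hand only so the example runs offline):

```ruby
require 'net/http'

# Hypothetical helper: turn a response into a one-line diagnosis.
def describe(res)
  case res
  when Net::HTTPSuccess
    "#{res.code}: ok, the page is in res.body"
  when Net::HTTPRedirection
    "#{res.code}: redirected to #{res['location']}"
  when Net::HTTPForbidden
    "#{res.code}: the server refused this request"
  else
    "#{res.code}: #{res.message}"
  end
end

# Stand-in for the response behind the 403 "Forbidden" error above.
forbidden = Net::HTTPForbidden.new('1.1', '403', 'Forbidden')
puts describe(forbidden)
```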

···

On Jul 19, 2006, at 7:50 AM, Just Another Victim of the Ambient Morality wrote:

"Logan Capaldo" <logancapaldo@gmail.com> wrote in message
news:70EBC0F7-3357-4DCD-BF2F-3BD0FBBAE92F@gmail.com...

The interface to Net::HTTP changed between 1.6 and 1.8
You want
http.get('/').body

    Ah, thanks...

    Now, I do have access to a Linux system with Ruby 1.6 and it
sometimes
succeeds but often fails with this error:

/usr/lib/ruby/1.6/net/protocol.rb:221:in `error!': 403 "Forbidden"
(Net::ProtoFatalError)
        from /usr/lib/ruby/1.6/net/http.rb:1217:in `value'
        from /usr/lib/ruby/1.6/net/http.rb:605:in `get'
        from ./web.rb:16

    How is this possible? How can it sometimes succeed and sometimes
fail?
Why can't this reliably work?
    Thank you...

It doesn't reliably work because this is the internet. Also the
particular error you are seeing means that the webserver has forbidden
access to the url you are trying to, um, access.

    But I can access the page just fine with a web browser. For instance,
go to "/" at "en.wikipedia.org" and it will reliably work with a browser.
Try it with Ruby and it will usually fail, although I did get it to work
occasionally, last night, which I don't get.
    Hell, "/" at "music.com" has _never_ worked and there doesn't appear to
be any kind of redirect or anything.
    Does any of this make sense?
    Thank you...

"Logan Capaldo" <logancapaldo@gmail.com> wrote in message
news:70EBC0F7-3357-4DCD-BF2F-3BD0FBBAE92F@gmail.com...

    Now, I do have access to a Linux system with Ruby 1.6 and it
sometimes
succeeds but often fails with this error:

/usr/lib/ruby/1.6/net/protocol.rb:221:in `error!': 403 "Forbidden"
(Net::ProtoFatalError)
        from /usr/lib/ruby/1.6/net/http.rb:1217:in `value'
        from /usr/lib/ruby/1.6/net/http.rb:605:in `get'
        from ./web.rb:16

    How is this possible? How can it sometimes succeed and sometimes
fail?
Why can't this reliably work?
    Thank you...

It doesn't reliably work because this is the internet. Also the
particular error you are seeing means that the webserver has forbidden
access to the url you are trying to, um, access.

But I can access the page just fine with a web browser. For instance,
go to "/" at "en.wikipedia.org" and it will reliably work with a browser.

No, it redirects to /wiki/Main_Page

Try it with Ruby and it will usually fail, although I did get it to work
occasionally, last night, which I don't get.

$ ruby -rnet/http
Net::HTTP.start 'en.wikipedia.org' do |http| res = http.get '/'; p res; end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Matches browser behavior.

Hell, "/" at "music.com" has _never_ worked and there doesn't appear to
be any kind of redirect or anything.

$ ruby -rnet/http
Net::HTTP.start 'music.com' do |http| res = http.get '/'; p res; end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Redirects to www.music.com in the browser, Net::HTTP matches browser behavior.
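
The 301 object carries its target in the Location header, so a script can see where the browser was sent. A small sketch, with a hand-built response standing in for the one `http.get '/'` returned and the target URL assumed from the browser behaviour described (both are assumptions so the example runs offline):

```ruby
require 'net/http'
require 'uri'

# Stand-in for the 301 printed above; the Location value is assumed.
res = Net::HTTPMovedPermanently.new('1.1', '301', 'Moved Permanently')
res['Location'] = 'http://www.music.com/'

if res.is_a?(Net::HTTPRedirection)
  # The header holds the URL to request next.
  target = URI.parse(res['location'])
  puts "redirected to #{target.host}, path #{target.path}"
end
```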

···

On Jul 19, 2006, at 1:20 PM, Just Another Victim of the Ambient Morality wrote:

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:087CE005-B271-4B73-9C98-7DF62B1AC652@segment7.net...

$ ruby -rnet/http
Net::HTTP.start 'en.wikipedia.org' do |http| res = http.get '/'; p res;
end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Matches browser behavior.

Hell, "/" at "music.com" has _never_ worked and there doesn't appear to
be any kind of redirect or anything.

$ ruby -rnet/http
Net::HTTP.start 'music.com' do |http| res = http.get '/'; p res; end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Redirects to www.music.com in the browser, Net::HTTP matches browser
behavior.

    Okay, so the site doesn't exist where I think it does and I get
redirected... in my browser. Net::HTTP doesn't get redirected, it simply
fails. _This_ behaviour doesn't match my browser. Is there any way I can
get redirected or find out where it wants to redirect me and go there,
myself?
    Thanks...

You might want to check out the API docs for Net::HTTP.

Copy/pasted from
http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html:

    require 'net/http'
    require 'uri'

    def fetch(uri_str, limit = 10)
      # You should choose better exception.
      raise ArgumentError, 'HTTP redirect too deep' if limit == 0

      response = Net::HTTP.get_response(URI.parse(uri_str))
      case response
      when Net::HTTPSuccess then response
      when Net::HTTPRedirection then fetch(response['location'], limit - 1)
      else
        response.error!
      end
    end

    print fetch('http://www.ruby-lang.org')
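
One caveat, offered as a hedged aside: servers sometimes send a relative `Location` (just a path), which cannot be requested on its own. A sketch of resolving the header against the URL that was requested; `next_url` is a hypothetical helper name, not part of Net::HTTP:

```ruby
require 'uri'

# Hypothetical helper: resolve a Location header against the URL
# that produced it, so relative and absolute targets both work.
def next_url(requested, location)
  URI.join(requested, location).to_s
end

puts next_url('http://en.wikipedia.org/', '/wiki/Main_Page')
puts next_url('http://music.com/', 'http://www.music.com/')
```

Inside `fetch`, that would mean passing `next_url(uri_str, response['location'])` to the recursive call.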

···

On 7/19/06, Just Another Victim of the Ambient Morality <ihatespam@rogers.com> wrote:

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:087CE005-B271-4B73-9C98-7DF62B1AC652@segment7.net...
>
> $ ruby -rnet/http
> Net::HTTP.start 'en.wikipedia.org' do |http| res = http.get '/'; p res;
> end
> #<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>
>
> Matches browser behavior.
>
>> Hell, "/" at "music.com" has _never_ worked and there doesn't appear to
>> be any kind of redirect or anything.
>
> $ ruby -rnet/http
> Net::HTTP.start 'music.com' do |http| res = http.get '/'; p res; end
> #<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>
>
> Redirects to www.music.com in the browser, Net::HTTP matches browser
> behavior.

    Okay, so the site doesn't exist where I think it does and I get
redirected... in my browser. Net::HTTP doesn't get redirected, it simply
fails. _This_ behaviour doesn't match my browser. Is there any way I can
get redirected or find out where it wants to redirect me and go there,
myself?

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:087CE005-B271-4B73-9C98-7DF62B1AC652@segment7.net...

$ ruby -rnet/http
Net::HTTP.start 'en.wikipedia.org' do |http| res = http.get '/'; p res;
end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Matches browser behavior.

Hell, "/" at "music.com" has _never_ worked and there doesn't appear to
be any kind of redirect or anything.

$ ruby -rnet/http
Net::HTTP.start 'music.com' do |http| res = http.get '/'; p res; end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

Redirects to www.music.com in the browser, Net::HTTP matches browser
behavior.

Okay, so the site doesn't exist where I think it does and I get
redirected... in my browser. Net::HTTP doesn't get redirected, it simply
fails.

No, it doesn't fail, it gives you a redirect. Follow it.

_This_ behaviour doesn't match my browser. Is there any way I can
get redirected or find out where it wants to redirect me and go there,
myself?

Follow the redirect. Look at open-uri for example code.

···

On Jul 19, 2006, at 2:55 PM, Just Another Victim of the Ambient Morality wrote:

Ok, so Net::HTTP doesn't fail, it just returns pretty low-level
information. If you want browser behavior for redirects, you have to
roll it yourself. I ran into something similar 2 days ago when I was
trying to fetch pages which required basic authentication (which, I
just learned, is the mechanism behind those username & password login
popups that the browser generates). I did manage to roll my own,
based on some examples from this mailing list.
Then I had to write a wrapper around that because sometimes my URIs
pointed to file:// instead of http:// - another thing the browser
handles invisibly.
Then there are those pages that use cookies to track logins...

So my question is, is there an existing library out there that
implements all these things a browser does? Where get(uri).page
returns the source of the same page I would see if I put the URI in
firefox?

-Adam
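
For the basic authentication case mentioned above, Net::HTTP does have a helper on the request object. A minimal sketch, not the code from the mailing list examples; the `authed_get` name, path, and credentials are all made up for illustration:

```ruby
require 'net/http'

# Hypothetical helper: build a GET request carrying Basic credentials.
def authed_get(path, user, password)
  req = Net::HTTP::Get.new(path)
  req.basic_auth(user, password)   # sets the Authorization header
  req
end

req = authed_get('/private/', 'adam', 'secret')
puts req['authorization']          # "Basic " followed by base64 of user:password

# Sending it needs a live server, for example:
#   Net::HTTP.start('example.com') { |http| http.request(req) }
```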

···

On 7/19/06, Joe Van Dyk <joevandyk@gmail.com> wrote:

On 7/19/06, Just Another Victim of the Ambient Morality <ihatespam@rogers.com> wrote:
>
> Okay, so the site doesn't exist where I think it does and I get
> redirected... in my browser. Net::HTTP doesn't get redirected, it simply
> fails. _This_ behaviour doesn't match my browser. Is there any way I can
> get redirected or find out where it wants to redirect me and go there,
> myself?

You might want to check out the API docs for Net::HTTP.

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:A270FB20-30C3-472C-8BA9-922490FFD19B@segment7.net...

No, it doesn't fail, it gives you a redirect. Follow it.

    You know, it would be easier to follow the redirect if I knew that it
was giving me a redirect and if I knew how to follow one were I to be given
it...

_This_ behaviour doesn't match my browser. Is there any way I can
get redirected or find out where it wants to redirect me and go there,
myself?

Follow the redirect. Look at open-uri for example code.

    You know, this is where Ruby's lack of documentation is really biting me
in the ass. I found some rudimentary docs on how to use it but, you know
what? For the life of me, no amount of googling can reveal exactly where I
can get the open-uri module... Seriously, where do I download this thing
from and how could I have known that?
    Thank you, from a very frustrated would-be Ruby programmer...

···

On Jul 19, 2006, at 2:55 PM, Just Another Victim of the Ambient Morality wrote:

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:A270FB20-30C3-472C-8BA9-922490FFD19B@segment7.net...

No, it doesn't fail, it gives you a redirect. Follow it.

    You know, it would be easier to follow the redirect if I knew that it
was giving me a redirect and if I knew how to follow one were I to be given
it...

Well, if you expect to use Net::HTTP you need to know both HTTP and the library. You might prefer WWW::Mechanize instead.

You'll get Top Quality Answers if you tell us what you really want to do. Help with the intricacies of a library may not let us guide you down the right path :)

Quoting myself:

$ ruby -rnet/http
Net::HTTP.start 'en.wikipedia.org' do |http| res = http.get '/'; p res; end
#<Net::HTTPMovedPermanently 301 Moved Permanently readbody=true>

That shows Net::HTTP returning a redirect when you perform a get.

_This_ behaviour doesn't match my browser. Is there any way I can
get redirected or find out where it wants to redirect me and go there,
myself?

Follow the redirect. Look at open-uri for example code.

You know, this is where Ruby's lack of documentation is really biting me in the ass. I found some rudimentary docs on how to use it but, you know what?

Unfortunately the Net::HTTP documentation isn't enabled in 1.8. You can use the 1.9 documentation however:

http://ruby-doc.org/core-1.9/classes/Net/HTTP.html

As luck would have it, it has a "Following Redirection" section a few pages down.

For the life of me, no amount of googling can reveal exactly where I can get the open-uri module... Seriously, where do I download this thing from and how could I have known that?

It ships with ruby, require 'open-uri'.
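
A minimal sketch of what that looks like; open-uri follows redirects for you. `page_source` is a made-up helper name. On the 1.8 discussed here the call was plain `open(url)`; `URI.open` is the spelling on current Ruby versions.

```ruby
require 'open-uri'

# Hypothetical helper: fetch a page body, letting open-uri
# follow any redirects along the way.
def page_source(url)
  URI.open(url) { |f| f.read }
end

# Needs network access:
#   puts page_source('http://en.wikipedia.org/')
```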

Thank you, from a very frustrated would-be Ruby programmer...

I've been working on getting lots more documentation enabled in 1.8. I hope to get the net/ stuff turned on, but I don't know if I'll have enough time to review/merge documentation from HEAD. (I'd really, really like help with this, and Hugh Sasse has brought in a ton of new documentation.)

···

On Jul 19, 2006, at 4:55 PM, Just Another Victim of the Ambient Morality wrote:

On Jul 19, 2006, at 2:55 PM, Just Another Victim of the Ambient Morality wrote:

>
> Okay, so the site doesn't exist where I think it does and I get
> redirected... in my browser. Net::HTTP doesn't get redirected, it simply
> fails. _This_ behaviour doesn't match my browser. Is there any way I can
> get redirected or find out where it wants to redirect me and go there,
> myself?

You might want to check out the API docs for Net::HTTP.

Ok, so Net::HTTP doesn't fail, it just returns pretty low-level
information. If you want browser behavior for redirects, you have to
roll it yourself. I ran into something similar 2 days ago when I was
trying to fetch pages which required basic authentication (which, I
just learned, is the mechanism behind those username & password login
popups that the browser generates). I did manage to roll my own,
based on some examples from this mailing list.
Then I had to write a wrapper around that because sometimes my URIs
pointed to file:// instead of http:// - another thing the browser
handles invisibly.
Then there are those pages that use cookies to track logins...

So my question is, is there an existing library out there that
implements all these things a browser does? Where get(uri).page
returns the source of the same page I would see if I put the URI in
firefox?

WWW::Mechanize is a programmatic web browser for ruby:
http://mechanize.rubyforge.org/
http://rubyforge.org/projects/mechanize

···

On Jul 19, 2006, at 7:03 PM, Adam Shelly wrote:

On 7/19/06, Joe Van Dyk <joevandyk@gmail.com> wrote:

On 7/19/06, Just Another Victim of the Ambient Morality <ihatespam@rogers.com> wrote:

-Adam

open-uri is included in the Ruby standard library.

The process for finding its documentation isn't that difficult.

Go to http://www.ruby-doc.org. Click on the 1.8.4 standard library
link. Click on the open-uri library link that's in the left frame.
Tada!

···

On 7/19/06, Just Another Victim of the Ambient Morality <ihatespam@rogers.com> wrote:

"Eric Hodel" <drbrain@segment7.net> wrote in message
news:A270FB20-30C3-472C-8BA9-922490FFD19B@segment7.net...
> On Jul 19, 2006, at 2:55 PM, Just Another Victim of the Ambient Morality wrote:
>
> No, it doesn't fail, it gives you a redirect. Follow it.

    You know, it would be easier to follow the redirect if I knew that it
was giving me a redirect and if I knew how to follow one were I to be given
it...

>> _This_ behaviour doesn't match my browser. Is there any way I can
>> get redirected or find out where it wants to redirect me and go there,
>> myself?
>
> Follow the redirect. Look at open-uri for example code.

    You know, this is where Ruby's lack of documentation is really biting me
in the ass. I found some rudimentary docs on how to use it but, you know
what? For the life of me, no amount of googling can reveal exactly where I
can get the open-uri module... Seriously, where do I download this thing
from and how could I have known that?
    Thank you, from a very frustrated would-be Ruby programmer...

<big snip>

You know, this is where Ruby's lack of documentation is really biting me in the ass. I found some rudimentary docs on how to use it but, you know what?

Unfortunately the Net::HTTP documentation isn't enabled in 1.8. You can use the 1.9 documentation however:

http://ruby-doc.org/core-1.9/classes/Net/HTTP.html

As luck would have it, it has a "Following Redirection" section a few pages down.

It's in the regular standard library documentation as well:

http://ruby-doc.org/stdlib/
http://ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTP.html

-Justin