400 "Bad Request"

Hi,
I'm developing a program to fetch the HTML contents of a site using
'net/http'. Everything works fine except for http://www.youtube.com. When
I pass that URL, I get an error like this:

/usr/lib/ruby/1.8/net/http.rb:2097:in `error!': 400 "Bad Request" (Net::HTTPServerException)

I think it is because I'm not sending a User-Agent header. Can anybody
please offer a suggestion? I'd be really grateful. This is my code
snippet:

response = Net::HTTP.get_response(URI.parse("http://www.youtube.com"))
case response
when Net::HTTPSuccess then response
when Net::HTTPRedirection
  response = Net::HTTP.get(URI.parse(response['location']))
else
  response.error!
end

Thanks

regards
Arun Kumar

--
Posted via http://www.ruby-forum.com/.

Hi

You can try getting the page with Mechanize [1].

irb -rubygems -rmechanize
irb(main):001:0> agent=WWW::Mechanize.new
irb(main):002:0> agent.get('http://www.youtube.com').code
=> "200"

It does seem to be as you said: without a User-Agent header, YouTube returns a 400:
irb(main):003:0> agent.user_agent=nil
=> nil
irb(main):004:0> agent.get('http://www.youtube.com').code
WWW::Mechanize::ResponseCodeError: 400 => Net::HTTPBadRequest
  from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:229:in `get'
  from (irb):4
irb(main):005:0>

[1] http://mechanize.rubyforge.org/
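
If you would rather stay with plain net/http, the same header can be set on a request object before it is sent. This is only a rough sketch: the agent string 'my-fetcher/0.1' is a placeholder made up for this example, and the actual send is left commented out so that nothing touches the network.

```ruby
require 'net/http'
require 'uri'

uri = URI.parse('http://www.youtube.com/')

# Build the GET request by hand so headers can be set before it is sent.
request = Net::HTTP::Get.new(uri.request_uri)
request['User-Agent'] = 'my-fetcher/0.1' # placeholder value (assumption)

# To actually send it:
# response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
# puts response.code
```

Header names are case-insensitive here, so request['user-agent'] reads back the same value.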

On Thu, Mar 26, 2009 at 12:21 PM, Arun Kumar <arunkumar@innovaturelabs.com> wrote:

--
Luis Parravicini
http://ktulu.com.ar/blog/

From a quick look at the Net::HTTP RDoc, your best option looks to be
the public instance method 'get', which accepts a hash of request
headers as its second argument.

Here is a suggested rework of your code snippet:

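require 'net/http'

Net::HTTP.start('www.youtube.com', 80) do |http|
  # Request headers go in the second argument of the instance method 'get'.
  response = http.get('/', { 'User-Agent' => 'ruby/net::http' })
  case response
  when Net::HTTPSuccess
    response
  when Net::HTTPRedirection
    # Net::HTTP.get returns the body as a String and takes no header hash here.
    response = Net::HTTP.get(URI.parse(response['location']))
  else
    response.error!
  end
end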

Of course, this quick snippet does not handle the case where you get a
Net::HTTPRedirection and the redirect points back to 'www.youtube.com':
the follow-up request would again go out without the custom
User-Agent header. (Then again, your original code didn't check the
redirection response either.) So this isn't a complete solution, but
it does show how to send the "User-Agent" header; I used a variation
of this code and was able to get content from www.youtube.com.
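
One way to close that gap is to send the header on every hop and cap the number of redirects. The following is a minimal sketch, not something tested against YouTube itself: 'fetch' and 'USER_AGENT' are names made up for this example, the limit of 5 is arbitrary, and the :use_ssl option assumes a newer Ruby than the 1.8 shown in the backtrace.

```ruby
require 'net/http'
require 'uri'

# The agent string used earlier in this thread; the constant name is hypothetical.
USER_AGENT = 'ruby/net::http'

# Hypothetical helper (not part of net/http): fetch a URL, sending the
# User-Agent header on every request and following at most `limit` redirects.
def fetch(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit <= 0

  uri = URI.parse(url)
  # :use_ssl lets the helper follow redirects to https URLs (assumes a modern Ruby).
  response = Net::HTTP.start(uri.host, uri.port, :use_ssl => uri.scheme == 'https') do |http|
    http.get(uri.request_uri, { 'User-Agent' => USER_AGENT })
  end

  case response
  when Net::HTTPSuccess
    response
  when Net::HTTPRedirection
    # Resolve a possibly relative Location header against the current URL.
    fetch(URI.join(url, response['location']).to_s, limit - 1)
  else
    response.error!
  end
end
```

Usage would be along the lines of fetch('http://www.youtube.com/').body. Unlike the snippet above, every branch returns a response object (or raises), so the caller does not have to guess whether it got a String or a response.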

Coey