Hi,
I'm developing a program to fetch the HTML contents of a site using
'net/http'. Everything works fine except for http://www.youtube.com. When
I pass that URL, I get an error like this:

/usr/lib/ruby/1.8/net/http.rb:2097:in `error!': 400 "Bad Request"
(Net::HTTPServerException)

I think it is because I'm not using any user agent. Can anybody please
give me a suggestion? I'd be really grateful. This is my code
snippet:
response = Net::HTTP.get_response(URI.parse("http://www.youtube.com"))
case response
when Net::HTTPSuccess then response
when Net::HTTPRedirection then response = Net::HTTP.get(URI.parse(response['location']))
else
  response.error!
end
> Hi,
> I'm developing a program to fetch the HTML contents of a site using
> 'net/http'. Everything works fine except for http://www.youtube.com. When
> I pass that URL, I get an error like this
Anyway, it seems you are right: without a User-Agent header, YouTube
returns a 400 (here `agent` is a WWW::Mechanize instance):
irb(main):003:0> agent.user_agent=nil
=> nil
irb(main):004:0> agent.get('http://www.youtube.com').code
WWW::Mechanize::ResponseCodeError: 400 => Net::HTTPBadRequest
        from /usr/lib/ruby/gems/1.8/gems/mechanize-0.9.2/lib/www/mechanize.rb:229:in `get'
        from (irb):4
irb(main):005:0>
Arun Kumar writes:
> Hi,
> I'm developing a program to fetch the HTML contents of a site using
> 'net/http'. Everything works fine except for http://www.youtube.com. When
> I pass that URL, I get an error like this:
>
> /usr/lib/ruby/1.8/net/http.rb:2097:in `error!': 400 "Bad Request"
> (Net::HTTPServerException)
>
> I think it is because I'm not using any user agent. Can anybody please
> give me a suggestion? I'd be really grateful. This is my code
> snippet:
>
> response = Net::HTTP.get_response(URI.parse("http://www.youtube.com"))
> case response
> when Net::HTTPSuccess then response
> when Net::HTTPRedirection then response =
> Net::HTTP.get(URI.parse(response['location']))
> else
> response.error!
> end
>
> Thanks
>
> regards
> Arun Kumar
From a quick look at the Net::HTTP RDoc, it looks like your best
option is to use the public instance method 'get', which accepts a
hash of request headers as its second argument.
Here is a suggested rework of your code snippet:
Net::HTTP.start('www.youtube.com', 80) {|http|
  response = http.get('/', {'User-Agent'=>'ruby/net::http'})
  case response
  when Net::HTTPSuccess then response
  when Net::HTTPRedirection then response = Net::HTTP.get(URI.parse(response['location']))
  else
    response.error!
  end
}
Of course, this quick snippet does not handle the case where you get a
Net::HTTPRedirection and the redirect target is 'www.youtube.com'
again: the follow-up request goes through Net::HTTP.get, which does
not send the User-Agent header set here. (Then again, your original
code didn't check the redirection response either.) So this isn't a
complete solution, but it will at least show you how to send the
"User-Agent" header; I used a variation of this code and was able to
get content from www.youtube.com.
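
If you also want the header sent on redirected requests, something along
these lines might work. It is only a rough sketch, not tested against
YouTube itself: the names fetch, build_request, redirect_target and
USER_AGENT, and the limit of 5 redirects, are my own assumptions. It also
passes use_ssl as an option to Net::HTTP.start, so it needs a newer Ruby
than the 1.8 shown in the traces above.

```ruby
require 'net/http'
require 'uri'

# Header value borrowed from the snippet above; any descriptive string should do.
USER_AGENT = 'ruby/net::http'

# Build a GET request that always carries the User-Agent header.
def build_request(uri)
  Net::HTTP::Get.new(uri.request_uri, 'User-Agent' => USER_AGENT)
end

# Resolve a Location header (absolute or relative) against the current URL.
def redirect_target(current_url, location)
  URI.join(current_url, location).to_s
end

# Hypothetical helper: fetch a URL, following at most `limit` redirects
# and sending the User-Agent header on every hop.
def fetch(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit <= 0

  uri = URI.parse(url)
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(uri))
  end

  case response
  when Net::HTTPSuccess     then response
  when Net::HTTPRedirection then fetch(redirect_target(url, response['location']), limit - 1)
  else response.error!
  end
end
```

You would then call fetch('http://www.youtube.com').body to get the HTML.
The redirect limit is there so that a redirect loop raises an error
instead of recursing forever.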