I'm new to Ruby and my company has given me an assignment in Ruby. It is
regarding HTML extraction. It works fine except for some sites like
http://www.youtube.com and http://www.gmail.com, where I get the errors
'400 Bad Request' and 'getaddrinfo: Name or service not known
(SocketError)' respectively. I have read that this may be because the URL
is being redirected, but I'm not sure about it. My code for HTML
extraction is:
require 'open-uri'

puts "Enter domain name :"
domain = gets
# concatenating 'http://www.' with the domain to open the page
url = "http://www." + domain
document = open(url)
# getting the original url of the site
url2 = document.base_uri.to_s
Can anybody please help? It is urgent. I'll be really grateful to those
who reply.
I'm new to Ruby and my company has given me an assignment in Ruby. It is
regarding HTML extraction.
You probably want Mechanize.
domain = gets
# concatenating 'http://www.' with the domain to open the page
url = "http://www." + domain
Take a look at that URL -- I'd say you don't need 'www' in that.
But I'm guessing what's hurting is the newline at the end of it.
Quick fix:
domain = gets.chomp
url = "http://#{domain}"
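To illustrate why the trailing newline matters, here is a minimal sketch. The helper name build_url is hypothetical, and the literal string stands in for what gets would return after the user types a domain and presses Enter. It only shows the string handling and does not open any connection.

```ruby
# What `gets` returns includes the newline typed by the user.
raw_input = "youtube.com\n"

# The original approach keeps the newline inside the URL.
broken_url = "http://www." + raw_input
puts broken_url.inspect

# Hypothetical helper: clean the input before building the URL.
def build_url(raw_input)
  # chomp drops the trailing newline; strip drops stray spaces
  domain = raw_input.chomp.strip
  # drop a scheme if the user typed one, so it is not added twice
  domain = domain.sub(%r{\Ahttps?://}i, "")
  "http://#{domain}"
end

puts build_url(raw_input).inspect
```

Printing with inspect makes the stray "\n" visible, which a plain puts would hide.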
Sorry to say, David, I tried that but the same error still occurs. Is it
because I have not set the user agent? Can you please tell me how to set
the user agent for Mozilla?
Thanks for your immediate reply.
On Tue, Mar 17, 2009 at 11:28 AM, Arun Kumar <arunkumar@innovaturelabs.com> wrote:
Sorry to say, David, I tried that but the same error still occurs. Is it
because I have not set the user agent? Can you please tell me how to set
the user agent for Mozilla?
Can I use user agents with Hpricot, or can they be used only with
Mechanize? I've found a user agent for Mozilla:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)
But it is still showing the same error.
--
Posted via http://www.ruby-forum.com/.
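On the user agent question: Hpricot only parses HTML, so the header has to be set by whatever fetches the page. With open-uri, string keys in the options hash are sent as HTTP header fields. Below is a rough sketch under a few assumptions: the helper names request_options and fetch_page are made up for illustration, the user agent string is just the one quoted in this thread, and URI.open is the form used by current Ruby versions (older versions used plain open). Whether a different user agent actually clears the '400 Bad Request' for a given site is not something this guarantees.

```ruby
require 'open-uri'

# Assumed example value: the browser-like string quoted earlier in the thread.
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; " \
             ".NET CLR 1.1.4322; .NET CLR 2.0.50727)"

# Hypothetical helper: options for open-uri.
# String keys are treated as HTTP request header fields.
def request_options(user_agent = BROWSER_UA)
  { "User-Agent" => user_agent }
end

# Hypothetical helper: fetch a page and return [final_url, html].
# base_uri is the URL the document was finally read from.
def fetch_page(url, user_agent = BROWSER_UA)
  URI.open(url, request_options(user_agent)) do |document|
    [document.base_uri.to_s, document.read]
  end
end

# Usage (needs network access, so it is left commented out):
#   final_url, html = fetch_page("http://youtube.com")
#   puts final_url
```

The returned HTML string can then be handed to the parser of your choice.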