TimeoutError in Net::HTTP get and post

I’m trying to rescue a TimeoutError in Net::HTTP’s get and post methods,
but it’s not working. I noticed some postings about how the rescue
catch-all didn’t work any more for TimeoutErrors, but I can’t even
explicitly rescue this error. I know the page is available and that
retrying will work, in this case, and all I want is a quick and dirty
way of scraping some web pages, but this error brings down my script.
Here’s my code:

begin
  resp, data = h.post('/tsd_listings/tsd_search.fpl', postdata, headers)
rescue TimeoutError => e
  retry
end

Thanks in advance for your help.

Carl Youngblood

P.S. I’m using Ruby 1.8.0

Was this just a stupid question or does nobody know the answer? Or
another option I guess would be that everyone is too busy.

Hmmm… I just tried to get Net::HTTP#post to raise a TimeoutError by
mounting a little WEBrick proc that slept 20 seconds before responding, and
#post never timed out for me. Can you give a small, complete example that
causes the TimeoutError?

Nathaniel

<:((><

I think the get and post timeout is set to 60 seconds by default. Your
script would probably work if you increased the waiting time. My script
only times out every once in a while since it is accessing a real web
server that usually responds in time.
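
For reference, here is a minimal sketch of how the read timeout can be inspected and raised. It assumes the standard library default of 60 seconds, which is worth checking against your own Ruby version; the host and port are placeholders. Constructing the object does not open a connection, so this runs without a server.

```ruby
require 'net/http'

# Placeholder host and port; Net::HTTP.new only builds the object
# and does not connect until a request is made.
h = Net::HTTP.new('localhost', 2000)

# Value before any change (believed to be 60 seconds by default).
default_read_timeout = h.read_timeout

# Allow a slow page up to 90 seconds per read before timing out.
h.read_timeout = 90

puts "default: #{default_read_timeout}, now: #{h.read_timeout}"
```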

Carl

By the way, thanks for your response.

OK, that was it. Here’s a quick server to duplicate the problem:

require 'webrick'

s = WEBrick::HTTPServer::new(:Port => 2000)
s.mount_proc('/') { |req, resp| sleep(20); resp.body = "Hi!" }
trap("INT") { s.stop }
s.start

Here’s a client that catches it for me:

require 'net/http'

h = Net::HTTP::new('localhost', 2000)
h.read_timeout = 10
begin
  puts "Trying..."
  resp = h.post('/', '')
  puts "Reply: #{resp.body}"
rescue TimeoutError => e
  puts "Error: #{e.inspect}"
end

My suggestion, rather than retrying the post, would be to simply bump up the
timeout using #read_timeout=. Retrying has the danger of continually trying
to get a page that takes 70 seconds to load using a 60 second timeout.
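
If you do still want to retry, one way to avoid looping forever is to cap the number of attempts. The following is only a rough sketch: with_timeout_retries and post_with_retries are hypothetical helper names, not part of Net::HTTP, and the 90 second timeout and attempt count are arbitrary. It rescues Timeout::Error, which is the full name of TimeoutError; as far as I know, the timeout errors raised by newer versions of Net::HTTP are subclasses of it.

```ruby
require 'net/http'
require 'timeout'

# Hypothetical helper: run the block, retrying on a timeout, but give up
# and re-raise once max_attempts have been made.
def with_timeout_retries(max_attempts = 3)
  attempts = 0
  begin
    attempts += 1
    yield attempts
  rescue Timeout::Error
    retry if attempts < max_attempts
    raise
  end
end

# Hypothetical wrapper combining a larger read timeout with bounded retries.
# It is only defined here, not called, so nothing connects when this loads.
def post_with_retries(host, port, path, body, read_timeout = 90)
  h = Net::HTTP.new(host, port)
  h.read_timeout = read_timeout
  with_timeout_retries(3) { h.post(path, body) }
end
```

Against the test server above, this could be called as post_with_retries('localhost', 2000, '/', ''). If every attempt times out, the last Timeout::Error propagates to the caller instead of being retried again.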

HTH,

Nathaniel

<:((><
