Package idea: attempt

Hi all,

I'm tired of this idiom:

max = 3
begin
    Timeout.timeout(val){   # 'val' is the timeout in seconds
       # some op that could fail or time out on occasion
    }
rescue Exception
    max -= 1
    if max > 0
       sleep interval        # 'interval' is the pause between tries
       retry
    end
    raise
end

Mark Fowler wrote a Perl module called "attempt" (http://search.cpan.org/~markf/Attempt-1.01/lib/Attempt.pm) that I think is pretty handy, and I'd like something similar for Ruby. I figure the API should look like this:

# 1st arg is the number of tries, 2nd is the interval (in seconds) between them
attempt(3, 300){
    FTP.open(host, user, passwd){ ... }
}

Here's my possibly naive implementation:

require 'timeout'

module Kernel
    # Try the block up to 'tries' times, sleeping 'interval' seconds
    # between attempts. If 'timeout' is given, each attempt is wrapped
    # in Timeout.timeout.
    def attempt(tries = 3, interval = 60, timeout = nil)
       begin
          if timeout
             Timeout.timeout(timeout){ yield }
          else
             yield
          end
       rescue Exception # a bare rescue would miss Timeout::Error on 1.8
          tries -= 1
          if tries > 0
             sleep interval
             retry
          end
          raise
       end
    end
end

What do you think? Useful? Are there any gotchas I need to consider, such as nested begin/end blocks or catch/throw? Anything else? Should I add a way to emit debug info? Finer-grained error handling?

Ideas welcome.

Thanks,

Dan

We had a bug in a system that did something like this, so it failed
literally 99 times out of 100.

Since the retry was fast, we only noticed the bug when I went hunting
another bug, inserted logging statements everywhere, and found the
retry/fail loop producing a massive stream of "BLAH failed, retrying"
messages.

Fixed that bug and suddenly the system was a lot faster and more stable...

Moral of the Story:

   Unlogged/unreported retries mask bugs; always log/report the number of
   retries.

John Carter                     Phone : (64)(3) 358 6639
Tait Electronics                Fax   : (64)(3) 359 4632
PO Box 1645 Christchurch        Email : john.carter@tait.co.nz
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.

···

On Fri, 9 Jun 2006, Daniel Berger wrote:

The following is incredibly nitpicky, I admit, but I figure I may as
well mention it. The line

Timeout.timeout(timeout){ yield }

Is it just me, or is that a lot of "timeout"? It hurts a stranger's
understanding of the code. Why not rename the passed-in timeout
variable to user_timeout, or anything else that isn't just 'timeout'?
- kate = masukom

···

On 6/8/06, Daniel Berger <djberg96@gmail.com> wrote:


John Carter wrote:

Moral of the Story:

  Unlogged/unreported retries mask bugs; always log/report the number of
  retries.

Yes, that is a potential issue. It occurred to me that errors that would normally be ignored could, and probably should, be emitted as warnings. That way, if there's an obvious problem with your code, you'll see it right away, assuming you're running from the command line (or have some other way of monitoring stderr).
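
Something like this, with the warn call being the only change from the implementation above:

require 'timeout'

module Kernel
    def attempt(tries = 3, interval = 60, timeout = nil)
       begin
          timeout ? Timeout.timeout(timeout){ yield } : yield
       rescue Exception => err
          tries -= 1
          if tries > 0
             # report every retry so failures can't hide
             warn "attempt: #{err.class}: #{err.message} (#{tries} tries left)"
             sleep interval
             retry
          end
          raise
       end
    end
end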

Regards,

Dan

···

On Fri, 9 Jun 2006, Daniel Berger wrote:

John Carter wrote:

Moral of the Story:

   Unlogged/unreported retries mask bugs; always log/report the number of
   retries.

Also, beware of retries at multiple levels of a protocol stack. I've
heard stories of a system that retried the lowest level of a protocol 3
times with a 30-second timeout (90 seconds total). The next layer up
added its own 3 tries (now we're at 4 1/2 minutes before a timeout
failure). The next several layers also did retries, with the end result
taking *hours* to time out.
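
To see how fast this compounds: with 3 tries per layer, each layer
multiplies the worst-case delay by 3. A quick illustration (not from the
original post):

# worst-case delay when every layer retries 3 times over a 30-second base
base = 30 # seconds
(1..5).each do |layer|
  printf "layer %d: %.1f minutes\n", layer, base * 3**layer / 60.0
end
# layer 1: 1.5 ... layer 5: 121.5 minutes -- hours, as described above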

Moral of story: Don't add retries indiscriminately.

-- Jim Weirich


kate rhodes wrote:

The following is incredibly nitpicky, I admit, but I figure I may as
well mention it. The line

Timeout.timeout(timeout){ yield }

Is it just me, or is that a lot of "timeout"? It hurts a stranger's
understanding of the code. Why not rename the passed-in timeout
variable to user_timeout, or anything else that isn't just 'timeout'?

- kate = masukom

Heh, I suppose it might be. I could change that.

I remember, back in the 1.6.x days, when "timeout" was a top-level method and I had a variable called "timeout" in my code. That took a while to track down. :)
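
Something like this, say (just renaming the timeout parameter, everything else as before):

require 'timeout'

module Kernel
    def attempt(tries = 3, interval = 60, max_time = nil)
       begin
          max_time ? Timeout.timeout(max_time){ yield } : yield
       rescue Exception
          tries -= 1
          if tries > 0
             sleep interval
             retry
          end
          raise
       end
    end
end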

Regards,

Dan

Jim Weirich wrote:

Also, beware of retries at multiple levels of a protocol stack. I've heard stories of a system that retried the lowest level of a protocol 3 times with a 30-second timeout (90 seconds total). The next layer up added its own 3 tries (now we're at 4 1/2 minutes before a timeout failure). The next several layers also did retries, with the end result taking *hours* to time out.

Moral of story: Don't add retries indiscriminately.

-- Jim Weirich

Yep, definitely something to watch out for. What can I say? Use with caution. :)

- Dan

for what it's worth, i have my own version of attempt in a few near-real-time
systems where the overriding principle is: keep going at all costs. in these
systems the 'fail big and fail early' principle doesn't work unless one enjoys
working on sundays - so i've got lots of stuff like attempt. it all logs to
stderr and/or log files, however, so it doesn't go unnoticed.

on another note, i've found that an incremental sleep increase with reset is
almost always what you want. retrying on the same interval seems to clog up
systems as you get into certain timing rhythms. in rq i use this a lot:

http://codeforpeople.com/lib/ruby/rq/rq-2.3.3/lib/rq-2.3.3/sleepcycle.rb

it's a cycle that looks like a sawtooth wave - on each retry we sleep longer
than before, essentially becoming more and more 'patient' before getting
really 'impatient' again.

i've found this matches the real world pretty well, since timing out a bunch
in a short period normally means you should wait longer.
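
a minimal sketch of the idea - the class and numbers below are illustrative,
not the actual sleepcycle.rb api:

# a sawtooth sleep cycle: each call to next returns a longer interval,
# wrapping back to the minimum once the ceiling is passed
class SawtoothCycle
  def initialize(min = 1, max = 30, step = 5)
    @min, @max, @step = min, max, step
    @interval = min
  end

  def next
    current = @interval
    @interval += @step
    @interval = @min if @interval > @max   # reset - 'impatient' again
    current
  end
end

cycle = SawtoothCycle.new
# sleep cycle.next before each retry => 1, 6, 11, 16, 21, 26, 1, 6, ...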

cheers.

-a

···

On Fri, 9 Jun 2006, Daniel Berger wrote:

Yep, definitely something to watch out for. What can I say? Use with
caution. :)

--
suffering increases your inner strength. also, the wishing for suffering
makes the suffering disappear.
- h.h. the 14th dalai lama

Hm, interesting. Maybe a more advanced version would use a full-fledged class with lots of options. Something like this:

attempt = Attempt.new{ |a|
    a.tries = 3          # Try 3 times
    a.interval = 30      # 30 seconds between tries, but...
    a.max = 90           # ...give up after 90 seconds total, in case of nested retries
    a.increment = 10     # Add 10 seconds to the interval with each try
    a.log = log          # Where 'log' is an IO handle
    a.warnings = $stderr # Send caught errors to an IO handle as warnings
}

attempt{
    # some op
}

Attempt#max would, in theory, be used to prevent Jim Weirich's nightmare scenario, where you have a bunch of nested retries, all doing their own sleep + retry thing.

So, using the above example, if I did something like this:

attempt{
    begin
       # some op
    rescue
       sleep 500
       retry
    end
}

It would error out at 90 seconds no matter what (the value we set to 'max'). I'm not sure if that's possible, however, or even how you would implement it. Thoughts?

- Dan

···

ara.t.howard@noaa.gov wrote:


something like:

   # sketch only - synchronize(:SH)/(:EX) stand in for shared/exclusive
   # locking around the @done flag

   def done
     synchronize(:SH){ @done }
   end

   def done=(d)
     synchronize(:EX){ @done = d }
   end

   # watchdog thread: raise MaxError in the calling thread once 'max'
   # seconds elapse, unless the attempt already finished
   def ensure_max!
     @max ||= Thread.new(max, Thread.current) do |m, c|
       sleep m
       c.raise MaxError unless done
     end
   end

   def attempt
     ...
   ensure
     @max.kill
   end

or something like that ;)

-a
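
for the curious, the same watchdog idea as a self-contained class using just
Mutex and Thread from the stdlib - class and method names here are
illustrative, not real code from any of the above:

require 'thread'

class Attempt
  class MaxError < StandardError; end

  def initialize(max)
    @max  = max        # hard ceiling, in seconds
    @lock = Mutex.new
    @done = false
  end

  def done?
    @lock.synchronize { @done }
  end

  def done=(flag)
    @lock.synchronize { @done = flag }
  end

  # run the block; if it is still going after @max seconds, a watchdog
  # thread raises MaxError in the calling thread
  def attempt
    caller_thread = Thread.current
    watchdog = Thread.new do
      sleep @max
      caller_thread.raise MaxError unless done?
    end
    result = yield
    self.done = true
    result
  ensure
    watchdog.kill if watchdog
  end
end

# Attempt.new(90).attempt { some_slow_op } # raises MaxError after 90 seconds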

···

On Sat, 10 Jun 2006, Daniel Berger wrote:

It would error out at 90 seconds no matter what (the value we set to 'max'). I'm not sure if that's possible, however, or even how you would implement it. Thoughts?


Hm... this has potential. I might be asking you for some help in the future.

Thanks,

Dan

···

ara.t.howard@noaa.gov wrote:


sure thing dan. just ping me offline.

cheers.

-a

···

On Sat, 10 Jun 2006, Daniel Berger wrote:

Hm... this has potential. I might be asking you for some help in the future.
