Following advice in an old ruby-talk thread (can't remember which one,
offhand), I'm trying to implement a timeout with threads. The
canonical example was something like:
    Thread.new do
      sleep 5
      Process.kill "ALRM", $$
    end
    begin
      # ... some stuff ...
    rescue SignalException => se
      return FAIL_CODE
    end
Well, the problem with that is that the thread keeps executing, and if
the process as a whole takes more than 5 seconds to complete, then the
SIGALRM kills the process.
So fine, thinks I, I'll just stop the thread when I'm done with it.
No problem. Only I can't figure out how. I can't call Thread#stop
from outside the thread. What I finally ended up with was something
like:
    rv = nil
    Thread.new do
      sleep 5
      Process.kill "ALRM", $$ if rv.nil?
    end
    # ... catch SIGALRM if it happens ...
    rv = query_some_stuff
    # ...
    return rv
I'm not excruciatingly happy about this solution, and I thought I'd
open it up to the Ruby community: What's the best way to do the
equivalent of 'alarm(5)' in C?
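For reference, Ruby does let one thread end another: Thread#kill can be called from outside the target thread, so the watchdog can be cancelled once the work is done. A minimal sketch, assuming the default SIGALRM disposition (no trap handler installed), so that the signal surfaces as a SignalException in the main thread; with_alarm is a hypothetical helper name, not part of any library:

```ruby
# Hypothetical helper: run the block with a watchdog thread that sends
# SIGALRM to this process after +seconds+. The watchdog is killed as soon
# as the block finishes, so a late alarm does not hit the rest of the program.
def with_alarm(seconds)
  watchdog = Thread.new do
    sleep seconds
    Process.kill "ALRM", Process.pid
  end
  yield
ensure
  # Thread#kill may be called from any thread; it is harmless if the
  # watchdog has already finished.
  watchdog.kill if watchdog
end

# Usage: a block that overruns is interrupted by the signal.
result = begin
  with_alarm(0.2) { sleep 2; :finished }
rescue SignalException
  :timed_out
end
puts result.inspect
```

There is still a small window between the block returning and the kill call in which the alarm can fire, so this narrows the race rather than removing it.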
-=Eric
--
Come to think of it, there are already a million monkeys on a million
typewriters, and Usenet is NOTHING like Shakespeare.
-- Blair Houghton.
Is this more like what you need?
    trap "ALRM" do
      puts "Looks like we've worn out our welcome. Goodbye!"
      exit
    end
    Thread.new do
      sleep 5
      Process.kill "ALRM", $$
    end
    puts "Waiting for the fun to start..."
    i = 1
    loop do
      puts i.to_s
      i += 1
      sleep 1
    end
...or do you definitely need to catch the SignalException within your main code?
Lennon
Lennon Day-Reynolds <rcoder@gmail.com> writes:
Is this more like what you need?
<snip example using trap>
..or do you definitely need to catch the SignalException within your main code?
I really do need to catch it within my main code. I'm querying a
number of remote machines for test status. Sometimes, very rarely,
that query will just go off into the weeds and stay there. I don't
know why yet, so I want to leave that situation in place, mark that
machine as nonresponsive, and move on. AFAIK, catching
SignalException is the only way to do that.
-=Eric
Eric,
I realized that this probably should have been my first suggestion:
how about using the standard 'timeout' module to accomplish the same
thing?
Ex:
    require 'timeout'
    begin
      Timeout.timeout(TIMELIMIT) do
        my_sometimes_too_long_method()
      end
    rescue Timeout::Error
      # Handle timeout here
    end
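Applied to the "mark the machine as nonresponsive and move on" case, the same pattern might look like the sketch below. The survey helper, the host names, and the 0.5 second limit are all hypothetical; the block stands in for whatever performs the real query. It only helps when the slow call is one the interpreter is able to interrupt.

```ruby
require 'timeout'

TIMELIMIT = 0.5 # seconds; an assumed value for illustration

# Hypothetical helper: yield each host to the caller's query block and
# record either the block's result or :nonresponsive if it overran.
def survey(hosts, limit = TIMELIMIT)
  status = {}
  hosts.each do |host|
    begin
      status[host] = Timeout.timeout(limit) { yield host }
    rescue Timeout::Error
      status[host] = :nonresponsive
    end
  end
  status
end

# Usage: "beta" simulates a query that goes off into the weeds.
report = survey(%w[alpha beta gamma], 0.2) do |host|
  sleep 2 if host == "beta"
  :ok
end
puts report.inspect
```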
Lennon Day-Reynolds wrote:
Eric,
I realized that this probably should have been my first suggestion:
how about using the standard 'timeout' module to accomplish the same
thing?
Ex:
    require 'timeout'
    begin
      Timeout.timeout(TIMELIMIT) do
        my_sometimes_too_long_method()
      end
    rescue Timeout::Error
      # Handle timeout here
      # (WhateverException is a placeholder for your own exception class)
      Thread.main.raise WhateverException
    end
This addition lets you handle the exception in your main thread.
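Thread#raise can also be used on its own, without signals, to deliver a timeout into the thread doing the work. A minimal sketch; QueryTimeout and with_deadline are hypothetical names chosen for illustration:

```ruby
# Hypothetical exception class for an overrun query.
class QueryTimeout < StandardError; end

# Run the block; if it has not finished after +seconds+, a watchdog thread
# raises QueryTimeout inside the thread that called with_deadline.
def with_deadline(seconds)
  target = Thread.current
  watchdog = Thread.new do
    sleep seconds
    target.raise QueryTimeout, "no answer after #{seconds}s"
  end
  yield
ensure
  watchdog.kill if watchdog # cancel the watchdog once the block is done
end

# Usage: the rescue runs in the calling thread, next to the code it protects.
outcome = begin
  with_deadline(0.2) { sleep 2; :done }
rescue QueryTimeout
  :timed_out
end
puts outcome.inspect
```

Like any asynchronous raise, this can only take effect at points where the interpreter is able to interrupt the target thread.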
Lennon Day-Reynolds <rcoder@gmail.com> writes:
I realized that this probably should have been my first suggestion:
how about using the standard 'timeout' module to accomplish the same
thing?
I'll see if it works. I'm currently surprised by the fact that
somehow the SIGALRM I'm sending to the main process apparently isn't
being received. I'm currently building an instrumented Ruby
interpreter to validate that it's not Ruby's fault.
    require 'timeout'
    begin
      Timeout.timeout(TIMELIMIT) do
        my_sometimes_too_long_method()
      end
    rescue Timeout::Error
      # Handle timeout here
    end
Alas, no love with this example. my_sometimes_too_long_method() just
goes on forever. I guess I'll just have to wait until my instrumented
interpreter finishes building.
-=Eric
Eric Schwartz <emschwar@pobox.com> writes:
Alas, no love with this example. my_sometimes_too_long_method() just
goes on forever. I guess I'll just have to wait until my instrumented
interpreter finishes building.
Not to follow up on myself or anything, but trying to rebuild the
Debian ruby1.8 package gives me:
$ make
../ext/extmk.rb:27:in `require': unexpected break (LocalJumpError)
from ./ext/extmk.rb:27
make: *** [all] Error 1
I couldn't find anything obvious from poking at google-- if anybody
has advice to share, I'd welcome it.
-=Eric
Eric Schwartz <emschwar@pobox.com> writes:
I'll see if it works. I'm currently surprised by the fact that
somehow the SIGALRM I'm sending to the main process apparently isn't
being received. I'm currently building an instrumented Ruby
interpreter to validate that it's not Ruby's fault.
Okay, it's Ruby's fault. Or, more probably, my fault for how I am
extending Ruby.
I instrumented signal.c, and what I've found is that sighandler() is
being called for the SIGALRM. In it, rb_trap_immediate is NOT set, so
rb_trap_pending is incremented, and the SIGALRM entry in trap_pending
list is incremented. So far so good-- it appears this is Ruby's way
of deferring handling of signals until it's safe to handle them.
The problem is, this signal is never getting handled. And, well,
kinda the point of a SIGALRM is that it gets sent in a reasonably
timely manner. I've noticed this behaviour seems to exist with
every signal, though, except SIGSTOP and SIGKILL (for obvious
reasons).
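The deferral pattern described here can be modelled in plain Ruby. This is only an illustrative model, not the interpreter's own code: the PendingSignals class and its drain method are hypothetical. The handler does nothing but count the signal; the real work happens when the main loop reaches a point it considers safe.

```ruby
# Model of deferred signal handling: the trap block only records arrival.
class PendingSignals
  def initialize(*names)
    @counts = Hash.new(0)
    names.each do |name|
      Signal.trap(name) { @counts[name] += 1 } # record, do not act
    end
  end

  # Called by the main loop at a safe point. Returns the names of the
  # signals that were waiting and resets their counters.
  def drain
    ready = @counts.select { |_, count| count > 0 }.keys
    ready.each { |name| @counts[name] = 0 }
    ready
  end
end

# Usage: the signal is noticed only when drain is eventually called.
pending = PendingSignals.new("ALRM")
Process.kill "ALRM", Process.pid
sleep 0.1 # give the trap block a chance to run
puts pending.drain.inspect
```

In this model, a main loop that never gets back to drain (for example because it is stuck inside a long call) never acts on the signal, which is consistent with the symptom described above.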
My code is at
http://sourceforge.net/tracker/?group_id=33142&atid=407383 if anyone
wants to double-check me. My questions are:
* Is there some way to force Ruby to deliver this signal?
* How can I tell why it's not being delivered?
Thanks for any help,
-=Eric
Eric,
I'm not sure what your problem with the Ruby rebuild is (though I
might recommend just doing a local build of the 1.8.1 sources rather
than the Debian package), but I may have an idea about the
SIGALRM/timeout issue you're having.
Is the long-running method calling out into C code? Even something
like a socket operation? If so, that system code may be blocking
signals before they can percolate up to the Ruby layer. I would try
sending signals from outside the Ruby process to see if they can
interrupt it during the long method.
Lennon
Lennon Day-Reynolds <rcoder@gmail.com> writes:
Is the long-running method calling out into C code? Even something
like a socket operation? If so, that system code may be blocking
signals before they can percolate up to the Ruby layer. I would try
sending signals from outside the Ruby process to see if they can
interrupt it during the long method.
I'm way ahead of you. I've tried it with a fork() instead of a new
thread, and I've even sent signals from a completely separate shell
process. No dice. I'm 99% sure it's the Ruby interpreter's fault,
because although I know that multiple SIGALRMs can be condensed into
one, I've never heard of only one taking over 30 seconds to be sent to
the process it's intended for.
-=Eric
Lennon Day-Reynolds <rcoder@gmail.com> writes:
Is the long-running method calling out into C code? Even something
like a socket operation?
I forgot to mention: yes, this is exactly what's happening. I built a
Ruby extension for the STAF library:
http://sourceforge.net/tracker/?group_id=33142&atid=407383
The STAF library itself is doing all sorts of C++ weirdness I dare not
attempt to decipher, lest I go insane trying. I fear some bizarre
interaction between STAF and Ruby, perhaps.
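If a call into an extension cannot be interrupted from inside the interpreter, one workaround (a different technique from the signal and thread approaches above) is to run the call in a child process and kill that process when it overruns. A minimal sketch, assuming a Unix-like platform where fork is available; run_in_child is a hypothetical helper, and the block stands in for the real extension call:

```ruby
require 'timeout'

# Hypothetical helper: run the block in a forked child and return its
# result, or :nonresponsive if the child has not answered within +seconds+.
def run_in_child(seconds)
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    writer.write Marshal.dump(yield) # send the block's result to the parent
    writer.close
    exit! 0                          # leave without running at_exit hooks
  end
  writer.close
  begin
    data = Timeout.timeout(seconds) { reader.read }
    Process.wait pid
    data.empty? ? nil : Marshal.load(data)
  rescue Timeout::Error
    Process.kill "KILL", pid         # SIGKILL cannot be blocked or deferred
    Process.wait pid
    :nonresponsive
  ensure
    reader.close
  end
end

# Usage: the sleeping child is killed and the parent moves on.
puts run_in_child(0.3) { sleep 5; :never_returned }.inspect
```

The parent only waits on a pipe read, which Ruby can interrupt, so the timeout no longer depends on the extension code cooperating. The cost is a process per query and a result that has to survive Marshal.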
-=Eric
Eric,
It could be the interpreter, or it could be something inside the STAF
library itself trapping SIGALRM and not letting the events reach the
interpreter (though a testing library that didn't allow you to use
SIGALRM in the code being tested would be surprising).
However, I really know nothing about STAF, so I couldn't speculate as
to what might be causing the problem. I've never had any problems with
the Kernel.trap method in Ruby before, which is the only reason I keep
leaning towards the bug being elsewhere.
Have the STAF maintainers been able to offer any sense of whether
other language bindings (specifically, I notice they list Python on
the homepage) have had any problems with signal handling?