[ANN] Ruby/Watchcat 1.0.0

Hello

I'm pleased to announce the release of Ruby/Watchcat 1.0.0.

Ruby/Watchcat is an extension for Ruby for the development of
watchcatd-aware applications.

Watchcatd is a watchdog-like daemon in the sense that it takes actions
in situations where a machine is under heavy load and/or unresponsive.
However, watchcatd isn't as drastic as the usual watchdog systems, which
reboot the machine. Instead, all it does is send a signal to a
registered process (which by default is SIGKILL) if the process doesn't
send it a heartbeat before a user-specified timeout.
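
To make the heartbeat rule concrete, here is a toy model in plain Ruby. It only illustrates the deadline logic described above; it is not how watchcatd is implemented, and the class name ToyWatchdog and its injectable clock are made up for this sketch.

```ruby
# Toy model of the heartbeat rule only; NOT watchcatd's implementation.
# ToyWatchdog and its clock parameter are hypothetical names.
class ToyWatchdog
  def initialize(timeout, clock = -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) })
    @timeout = timeout
    @clock = clock
    @last_beat = @clock.call
  end

  # The registered process calls this to prove it is still alive.
  def heartbeat
    @last_beat = @clock.call
  end

  # True once more than +timeout+ seconds have passed since the last
  # heartbeat. This is the point where the daemon would send the signal.
  def expired?
    @clock.call - @last_beat > @timeout
  end
end
```

The clock is passed in only so the rule can be checked without real waiting; a real daemon lives outside the monitored process, which is what lets it act when that process hangs.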

Ruby/Watchcat allows you to register Ruby applications with watchcatd.

Examples:

  require 'watchcat'

  # Create a new cat.
  cat = Watchcat.new(:timeout => 10, :signal => 'KILL',
                     :info => 'killing from ruby!')
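  begin
    loop do
      # Here you do something that could exceed the timeout
      sleep 9 + rand(3)
      cat.heartbeat # we're still alive
    end
  ensure
    # Runs when the loop is left by an exception or interrupt
    # (but not on SIGKILL, which cannot be caught).
    cat.close # clean the cat's litter box
  end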

You can also use a block, in which case the cat cleans its own litter box:

  require 'watchcat'

  Watchcat.new do |cat|
    loop do
      do_something_that_can_be_slow
      cat.heartbeat
    end
  end

For more details, please refer to the README file in the distribution and
on the project's homepage at http://oss.digirati.com.br/ruby-watchcat/.

This is my first Ruby C extension, so I would greatly appreciate comments
and suggestions :)

Best regards,
Andre Nathan

<snip>

so, in effect, this is a Timeout::timeout based on a child process?

looks handy for some of the work i'm doing now - which is a 24x7 satellite
ingest system which spawns many processes which can potentially hang...

-a

On Sat, 22 Apr 2006, Andre Nathan wrote:

Hello

I'm pleased to announce the release of Ruby/Watchcat 1.0.0.

--
be kind whenever possible... it is always possible.
- h.h. the 14th dalai lama

Hmmm, me thinking...

With Mongrel you've got the situation that a single Mongrel server could
potentially be handling many requests at once, so killing off a dead one
could really make things bad. But, in a shared hosting environment this
would be perfect for catching the poorly coded servers that eat up resources.

I kind of like this solution though since it is more difficult for the
person to cheat. They can't really turn it off by injecting Ruby code since
they still have to talk to the watchcat. I'm curious whether they could cheat
in other ways, such as transferring the socket to another process that always
works.

On 4/21/06 7:03 PM, "Andre Nathan" <andre@digirati.com.br> wrote:

On another note, you know there are options for throttling and restricting
the number of active threads in Mongrel, right? -t sets a timeout between
each socket accept (the docs say seconds, but it's actually in 1/100ths of
a second). -n makes sure the number of processor threads doesn't go above
a given limit.

Zed A. Shaw

http://mongrel.rubyforge.org/

Hi Ara

so, in effect, this is a Timeout::timeout based on a child process?

Yes, except that when the timeout expires, it triggers an action by the
watchcat daemon (which by default SIGKILLs the process).
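
For comparison, here is a minimal sketch of the in-process approach using Ruby's standard Timeout module. The helper name run_with_timeout is made up for illustration. The difference it is meant to show: here the timed-out code receives an exception it can rescue and recover from, whereas with watchcatd the signal comes from a separate daemon, and the default SIGKILL cannot be caught at all.

```ruby
require 'timeout'

# Hypothetical helper: returns :finished if the block completes within
# +seconds+, or :timed_out if Timeout raises Timeout::Error first.
def run_with_timeout(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  # The process is still alive here and decides for itself what to do.
  :timed_out
end

# Example call:
#   run_with_timeout(5) { do_something_that_can_be_slow }
```

Timeout also relies on the Ruby interpreter still being able to run the timer, so it may not help when the process is stuck in a way that blocks the interpreter itself; an external daemon does not have that limitation.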

looks handy for some of the work i'm doing now - which is a 24x7 satellite
ingest system which spawns many processes which can potentially hang...

Our main use for it is in a shared hosting environment. We wrote a
mod_watchcat for apache2 and use watchcatd to kill misbehaving customer
scripts (that has helped increase our servers' uptime a lot). When I
wrote the extension, my idea was to use it for something similar, maybe
with Mongrel, but I guess it would be useful for your satellite
application too :)

Regards,
Andre

On Sat, 2006-04-22 at 23:03 +0900, ara.t.howard@noaa.gov wrote:

Hi Zed

With Mongrel you've got the situation that a single Mongrel server could
potentially be handling many requests at once, so killing off a dead one
could really make things bad. But, in a shared hosting environment this
would be perfect for catching the poorly coded servers that eat up resources.

Yes, the library is better suited for a multi-process model, like the
pre-forking MPM in apache2, because we can just kill the process that is
eating the resources without killing the whole server.

I'm actually not very familiar with Mongrel (just used it for a simple
Camping app), although I'm planning to learn more when time permits. I'm guessing
that to make it work with Mongrel, one would need a wrapper script to
launch the server, so that it could be relaunched after it's killed.
Does that make sense?
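
A minimal sketch of such a wrapper, assuming the server runs as a child process and should be relaunched only when it dies from a signal. The method name supervise and its max_launches and pause parameters are made up for this sketch; the cap exists only to keep the example bounded.

```ruby
# Hypothetical wrapper: run +command+ (an array of program and arguments)
# and relaunch it each time it is killed by a signal, as it would be when
# watchcatd sends SIGKILL. Returns the number of launches.
def supervise(command, max_launches: 5, pause: 0.1)
  launches = 0
  while launches < max_launches
    pid = Process.spawn(*command)
    launches += 1
    _, status = Process.wait2(pid)
    # A normal exit means the server stopped on its own; do not relaunch.
    break unless status.signaled?
    warn "child #{pid} killed by signal #{status.termsig}, relaunching"
    sleep pause
  end
  launches
end

# Example call (the command line is only an illustration):
#   supervise(%w[mongrel_rails start])
```

Note that the wrapper only restarts the server; requests that were in flight when the process was killed are still lost.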

I kind of like this solution though since it is more difficult for the
person to cheat. They can't really turn it off by injecting Ruby code since
they still have to talk to the watchcat. I'm curious whether they could cheat
in other ways, such as transferring the socket to another process that always
works.

In our environment, the user has no control over the watchcat (it is
created by mod_watchcat), so to pass the socket descriptor to another
process the user would first have to guess what the descriptor is. It's
not impossible to do it, but then it would be easy for us to identify a
user doing that.

Regards,
Andre

On Sun, 2006-04-23 at 02:53 +0900, Zed Shaw wrote: