I am just learning Ruby and planning to use it for a network
monitoring system. I would like to design a system that will scale well
and handle large workloads (tens of thousands of monitors per server).
My question is about concurrency. I am looking at threads and have
fiddled with them a bit. Ruby makes threading pretty straightforward.
Now I'm trying to figure out how best to use them. The design goal is to
have a server host tens of thousands of monitors in the most efficient
way. Monitors will need to be executed at specific and varying
intervals (e.g. 1000 monitors once a minute / 2000 monitors once every
three minutes / 5000 monitors once every 5 minutes / 10,000 monitors
once every 10 minutes, etc.).
This is a problem that I'm sure has already been solved. I would
appreciate any suggestions that the group can offer regarding a proven
design to tackle this goal.
Thanks, Don
--
Posted via http://www.ruby-forum.com/.
Hi Don,
With this number of tasks it's inefficient to have a thread per
monitor. Things like this are typically tackled with a lightweight
processing framework: you have a pool of threads that get fed tasks
via a thread-safe queue (Tomcat does its request processing similarly,
although with a few more bells and whistles). Ruby comes with
everything you need for that, apart perhaps from a scheduler (but better
check the RAA). You then "just" have to glue the pieces together. I found
Doug Lea's book very valuable (although it's written for Java, the
basic principles and strategies are the same):
http://www.amazon.com/gp/product/0201310090
http://g.oswego.edu/
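A minimal sketch of the pool-plus-queue idea, using only Ruby's standard
library. The class name WorkerPool, the STOP sentinel and the pool size of 4
are illustrative assumptions, not part of any library; a real monitor would
do its network check where the lambda below merely records a string.

```ruby
# A fixed pool of worker threads fed through a thread-safe queue.
# WorkerPool and STOP are hypothetical names chosen for this sketch.
class WorkerPool
  STOP = Object.new # sentinel that tells a worker to exit

  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop blocks until a job (or the sentinel) is available.
        until (job = @jobs.pop).equal?(STOP)
          begin
            job.call
          rescue StandardError => e
            # One failing monitor must not kill the worker thread.
            warn "job failed: #{e.class}: #{e.message}"
          end
        end
      end
    end
  end

  # Enqueue any object that responds to #call (a lambda, a monitor object, ...).
  def submit(job)
    @jobs << job
  end

  # Let already-queued jobs finish, then stop and join every worker.
  def shutdown
    @workers.size.times { @jobs << STOP }
    @workers.each(&:join)
  end
end

results = Queue.new
pool = WorkerPool.new(4)
20.times { |i| pool.submit(-> { results << "monitor-#{i} ok" }) }
pool.shutdown
puts "completed #{results.size} checks"
```

The number of threads stays fixed no matter how many monitors are queued,
which is the point of the design; a scheduler would be the piece that calls
submit whenever a monitor falls due.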
HTH
Kind regards
robert
2006/5/5, Don Stocks <amillionhitpoints@yahoo.com>:
> I am just learning Ruby and planning to use it for a network
> monitoring system. I would like to design a system that will scale well
> and handle large workloads (tens of thousands of monitors per server).
> My question is about concurrency. I am looking at threads and have
> fiddled with them a bit. Ruby makes threading pretty straightforward.
> Now I'm trying to figure out how best to use them. The design goal is to
> have a server host tens of thousands of monitors in the most efficient
> way. Monitors will need to be executed at specific and varying
> intervals (e.g. 1000 monitors once a minute / 2000 monitors once every
> three minutes / 5000 monitors once every 5 minutes / 10,000 monitors
> once every 10 minutes, etc.).
> This is a problem that I'm sure has already been solved. I would
> appreciate any suggestions that the group can offer regarding a proven
> design to tackle this goal.
Depending on what you're trying to do, a single-threaded approach may give
you more scalability and performance:
http://rubyforge.org/projects/eventmachine
Thanks, guys. I'll look into both of these suggestions. I really
appreciate the tips!
- Don
Francis,
I took a look at EventMachine. It looks very interesting. But it seems
that its primary use is to abstract networking. What are your thoughts
on how this could be leveraged for the monitoring service I
describe above (i.e. to manage a large, dynamic queue of jobs and run them
at specific intervals)?
Thanks! - Don
The idea behind using a single-threaded engine ("reactor model") in your
application would be this: Each of your tasks is implemented in nothing more
than an instance of a Ruby class (that you write). Rather than asking you to
maintain a thread pool and schedule each chunk of work onto the next
available thread, EventMachine just calls methods on your objects whenever
it's time to do some work. It definitely does have the ability to fire
requests into your objects periodically, based on timers that you set up.
The reason to use this approach instead of thread pools is that it can be
far faster and more scalable. The downside is that your network-protocol
handling may be a little more complicated.
Since you're talking about monitoring a large number of external entities
(processes? systems? users?), my next question is: what network protocol
will be used? We've already implemented EventMachine protocol handlers for
HTTP/S, LDAP, SMTP, SIP, and a few others.
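To make the timer-driven idea concrete, here is a toy single-threaded timer
loop written with plain Ruby. It is not EventMachine's API: TimerLoop,
PingMonitor, every, tick and on_timer are hypothetical names invented for
this sketch, and time is simulated rather than read from the clock so the
example runs instantly.

```ruby
# Each task is just an object; the loop calls a method on it when its
# timer fires. All names here are illustrative, not EventMachine's.
class PingMonitor
  attr_reader :name, :runs

  def initialize(name)
    @name = name
    @runs = 0
  end

  # Called by the loop whenever this monitor's timer is due.
  def on_timer(_fired_at)
    @runs += 1
  end
end

class TimerLoop
  Entry = Struct.new(:due, :interval, :target)

  def initialize
    @entries = []
  end

  # Register `target` to be called every `interval` seconds.
  def every(interval, target, start: 0)
    @entries << Entry.new(start + interval, interval, target)
  end

  # Fire every timer that is due at time `now`, then reschedule it.
  def tick(now)
    @entries.each do |entry|
      while entry.due <= now
        entry.target.on_timer(entry.due)
        entry.due += entry.interval
      end
    end
  end
end

timers = TimerLoop.new
fast = PingMonitor.new("every-minute")
slow = PingMonitor.new("every-three-minutes")
timers.every(60, fast)
timers.every(180, slow)

# Simulate ten minutes of wall-clock time in one-second steps.
(1..600).each { |second| timers.tick(second) }
puts "#{fast.name}: #{fast.runs} runs, #{slow.name}: #{slow.runs} runs"
```

A real loop would sleep until the earliest due time instead of stepping a
simulated clock, and with tens of thousands of timers it would probably keep
them in a priority queue ordered by due time rather than scanning an array.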
Don, I thought about it a bit more. Your original question was how to
achieve concurrency. So there are evidently latencies in your system that
you'd like to capture, and my question is: how do these arise? If they come
from the network, then a solution like EventMachine is ideal. If they are
compute-bound, then you should also look at EventMachine, and you should not
be looking at a threaded approach unless you're able to specify
multi-processor hardware. If the latencies are coming from elsewhere in your
local system (such as disk i/o), then you should look at thread pools.
I hope that's helpful. Best of luck
-f
Excellent! It sounds like EventMachine may do exactly what I'm needing.
That's great news. Looks like I'll need to dig into it. Thanks!
Yes, I will be monitoring the most common network services (HTTP/S,
SMTP, POP3, DNS, IMAP, etc.) on remote systems. I'll also want to track
things like connect times and transaction response times. I haven't
started coding the monitor classes yet, as I'm just now trying to
formulate a design. It sounds like you may have done some of the work
for me! If I understand it correctly, I should be able to build my
monitor classes on your existing protocol handlers and create
my own protocol handlers for those that don't exist.
To your latency question: I hope I understand your question correctly.
I think most of the latency will come from the network. For example, if
there are four monitors that need to be run in the current 60-second
window, then I need to run them simultaneously. If they are run
serially, then monitors later in the queue could get stuck behind a
long-running monitor. The polling server (the system running the monitors)
can't get stuck on a single monitor for an extended period of time since
that would cause all work to stop. How would EventMachine handle this
without running each active monitor in its own worker thread?
On the issue of scaling: I would like to use a shared queue that would
allow additional polling servers to be added for scaling out. Any initial
thoughts on how EventMachine can fit into this model?
I hope I'm not getting in over my head! It seems pretty complex for
such a simple mind. 
Thanks for all the help! - Don
Ok, let's see if I understand you. (I certainly hope we're not going into
inappropriate territory for this list with a discussion of a particular
system design!)
You have a lot of network servers (many different protocols) floating around
your network, and you want to periodically send client requests to each of
them and measure the response times. (And of course send you alerts when
they don't respond.)
One way to do this would be to tell EventMachine to kick off one request for
each of the monitored servers every minute (for example). That's a simple
matter of instantiating a bunch of objects. The objects send their requests,
then wait for the responses, and then possibly talk to some singleton or
database connection somewhere. The objects die off by themselves when the
protocol-conversation completes or they time out. EventMachine manages all
of this by calling methods in your objects whenever timers expire or you
send data on them or they receive data from the network. So the network
drivers are working away in the kernel while you're processing each request.
To start it off, you call EventMachine's #run method, and it does everything
else. If you want to do other work on Ruby threads while the event machine
runs, that's ok too. The system will be live and concurrent as long as it
doesn't take you an inordinate amount of time to process each response
(you're probably just timing them, so no problem).
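As a rough illustration of why one thread is enough here, the sketch below
starts several TCP connects without blocking and lets IO.select report which
ones have finished, timing each. It uses only Ruby's standard socket library,
not EventMachine; measure_connect_times and its parameters are hypothetical
names, and the local listener exists purely so the demo has something to
connect to.

```ruby
require "socket"

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Start every connect without blocking, then wait on all of them at once.
# Returns { "host:port" => seconds, or nil on failure or timeout }.
def measure_connect_times(targets, timeout: 5.0)
  deadline = monotonic_now + timeout
  pending = {} # socket => [label, sockaddr, start time]
  times = {}

  targets.each do |host, port|
    label = "#{host}:#{port}"
    sock = Socket.new(:INET, :STREAM)
    addr = Socket.pack_sockaddr_in(port, host)
    t0 = monotonic_now
    begin
      sock.connect_nonblock(addr)        # may complete at once on loopback
      times[label] = monotonic_now - t0
      sock.close
    rescue IO::WaitWritable              # connect is in progress
      pending[sock] = [label, addr, t0]
    rescue SystemCallError               # e.g. connection refused
      times[label] = nil
      sock.close
    end
  end

  until pending.empty?
    remaining = deadline - monotonic_now
    break if remaining <= 0
    ready = IO.select(nil, pending.keys, nil, remaining)
    break unless ready                   # nil means the wait timed out
    ready[1].each do |sock|
      label, addr, t0 = pending.delete(sock)
      begin
        sock.connect_nonblock(addr)      # second call reports the outcome
        times[label] = monotonic_now - t0
      rescue Errno::EISCONN              # already connected: success
        times[label] = monotonic_now - t0
      rescue SystemCallError             # the connect failed
        times[label] = nil
      ensure
        sock.close
      end
    end
  end

  # Anything still pending ran out of time.
  pending.each do |sock, (label, _addr, _t0)|
    times[label] = nil
    sock.close
  end
  times
end

server = TCPServer.new("127.0.0.1", 0)   # throwaway listener for the demo
port = server.addr[1]
times = measure_connect_times([["127.0.0.1", port]], timeout: 2.0)
times.each { |label, secs| puts "#{label}: #{secs ? format('%.6f s', secs) : 'failed'}" }
server.close
```

A slow or dead server only occupies an entry in the pending table until the
timeout, so it never holds up the other monitors.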
I'm happy to help you out if you want to give it a shot. Send me a private
email so we don't pollute this list 