Thread safety techniques for server applications?

Hey all,

I'm looking for some information about handling thread safety in Ruby.
I wrote an application server and need to make sure it's thread safe.
It's used over HTTP requests, so it's possible multiple people hit it
at once. I have some questions that will help me work that out:

1. Does using mongrel / lighttpd / webrick ensure thread safety? (my
application relies on these)
2. What kinds of things in the Ruby language should I NOT do because
they cause thread headaches (maybe static variables)?
3. What techniques can I use to go about testing thread safety?

I'm not asking anything in reference to rails. This would be just
general Ruby thread safety ideas..

thanks..

···

--
Posted via http://www.ruby-forum.com/.

Hi Aaron,
I'd like to learn more in this area too, but here are my thoughts:
The web servers, at least mongrel, are single-threaded. Mongrel queues
requests and feeds them to the app sequentially. To get concurrency
you have to run multiple instances of mongrel. In this situation there
are no thread safety issues because there's only one thread per
process.
I like the idea of separate processes instead of worrying about thread
safety, but sometimes I need multiple threads, for example in a jabber
client (keepalives, listeners, etc). What I've been doing is keeping
it as simple as possible and so far I haven't had to think about
thread conflicts. Or maybe I should be, but I haven't ;)
--Dave

···

On Aug 24, 7:05 pm, Aaron Smith <beingthexempl...@gmail.com> wrote:


> I'm looking for some information about handling thread safety with Ruby.
> I've got an application server I wrote that I need to make sure it's
> thread safe. This application server is used over http requests so it's
> possible multiple people hit it at once. I have some questions that will
> help me determine..

In general, it's the same as any other type of threaded programming. Share as little as possible, and control access to shared resources so that two threads aren't changing the same state at the same time and running into each other. Look at the Mutex class and the Queue class as starting points for tools to help you do this.
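
A minimal sketch of those two tools, assuming a made-up HitCounter class and a nil "stop" signal for the workers (neither comes from this thread). The Mutex guards the read-modify-write on the counter, and the Queue hands jobs between threads without any explicit locking:

```ruby
require 'thread'  # Mutex and Queue (a no-op on modern Rubies)

# Hypothetical shared resource: a counter that several threads update.
class HitCounter
  def initialize
    @lock  = Mutex.new
    @count = 0
  end

  # Only one thread at a time may run the read-modify-write below.
  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

counter = HitCounter.new
jobs    = Queue.new   # thread-safe hand-off into the workers
results = Queue.new   # thread-safe hand-off back out

workers = Array.new(4) do
  Thread.new do
    # A nil job is the (assumed) shutdown signal for a worker.
    while (job = jobs.pop)
      counter.increment
      results << job * 2
    end
  end
end

10.times { |i| jobs << i }
workers.size.times { jobs << nil }
workers.each(&:join)

doubled = []
doubled << results.pop until results.empty?
puts "handled #{counter.count} jobs"
```

The workers never touch each other's data directly: everything shared goes through either the mutex or a queue.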

> 1. Does using mongrel / lighttpd / webrick, ensure thread safety? (my
> application relies on these)

lighttpd is an external web server, so it's irrelevant.

Both mongrel and webrick are threaded Ruby web server platforms. However, they don't do anything to ensure that the code you run inside of them is thread safe.

> 2. What kinds of things in the Ruby language should I NOT do that will
> cause thread headaches.. (maybe static variables)?

The main thing to keep in mind is that Ruby threads are green threads -- they are all managed inside of the Ruby interpreter, so they all share a single process. Thus, the use of threads will rarely increase the throughput of your program, unless there is some external latency that can be captured, and that external latency does not occur inside of a Ruby extension.

This is because while the flow of execution is inside of an extension, it is out of Ruby's control, and no thread task switching will take place.

Also, be aware that Ruby uses a select() loop to manage its threads of execution, and select() has an FD_SETSIZE limit of 1024 handles, so there is a sharp upper boundary on the number of threads you can have in a Ruby process.

> 3. What techniques can I use to go about testing thread safety?

Look for areas in your code where you share resources between your threads. Do you take precautions to keep multiple threads from stepping on each other when using those resources?

Write test code that creates multiple threads, and tries to stress those areas.
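
As a rough sketch of that kind of stress test, assuming a hypothetical Account class as the code under test: many threads make the same small update, and the final total is compared against what a race-free run must produce. The Thread.pass call is there only to invite a context switch in the middle of the update.

```ruby
require 'thread'

# Hypothetical class under test: the balance is the shared state.
class Account
  attr_reader :balance

  def initialize
    @balance = 0
    @lock    = Mutex.new
  end

  def deposit(amount)
    @lock.synchronize do
      current = @balance
      Thread.pass               # invite a context switch mid-update
      @balance = current + amount
    end
  end
end

# Hammer one account from many threads and return the final balance.
def stress(account, threads: 20, deposits: 200)
  Array.new(threads) {
    Thread.new { deposits.times { account.deposit(1) } }
  }.each(&:join)
  account.balance
end

expected = 20 * 200
actual   = stress(Account.new)
puts(actual == expected ? "ok: #{actual}" : "lost updates: #{actual} of #{expected}")
```

Taking the synchronize block out of deposit will typically make the same test report lost updates, which is a quick way to check that the test really exercises the shared state.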

Kirk Haines

···

On Sat, 25 Aug 2007, Aaron Smith wrote:

Roger Pack wrote:

I'm looking for some information about handling thread safety with Ruby.

Note also that sometimes if you are reading from TCPSockets the sockets
get confused and start reading from one another. To avoid this (I
think) use Francis Cianfrocca's EventMachine.

Possible ways to fix this might (might) be to ensure that every socket
read/write is 'not at the same time as any other read/write' (i.e.
surrounded by a mutex lock), or to perhaps write a drop in replacement
for the TCPSocket class that just uses EventMachine in the background
for I/O and queues the input/output.
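
A sketch of the mutex idea only, assuming a made-up LockedSocket wrapper (it is not part of Ruby's socket library, and it is not a confirmed fix for the problem described above). Every read and write on the wrapped socket goes through one lock:

```ruby
require 'socket'
require 'thread'

# Hypothetical wrapper: all reads and writes on the wrapped socket go
# through one mutex, so two threads never interleave on the same socket.
class LockedSocket
  def initialize(socket)
    @socket = socket
    @lock   = Mutex.new
  end

  def write_line(text)
    @lock.synchronize { @socket.write(text + "\n") }
  end

  def read_line
    @lock.synchronize { @socket.gets }
  end
end

server = TCPServer.new('127.0.0.1', 0)              # port 0: let the OS pick
client = TCPSocket.new('127.0.0.1', server.addr[1])
accepted = server.accept
peer     = LockedSocket.new(accepted)

# Five threads write through the same wrapped socket.
writers = Array.new(5) { |i| Thread.new { peer.write_line("message #{i}") } }
writers.each(&:join)

lines = Array.new(5) { client.gets.chomp }
puts lines.sort
[client, accepted, server].each(&:close)
```

One design caveat: with a single lock, a thread blocked in read_line keeps every writer on that socket waiting, so separate read and write locks may be a better fit for long-lived connections.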

I still haven't found a fix for the problem where methods defined in
one thread end up assigned to a different thread. I think I may just
report this one to ruby and forget about it.

Good luck all.
-Roger

···


I don't believe this is true of mongrel itself, but rather of the Rails handler in mongrel.

Corey

···

On Aug 25, 2007, at 12:05 AM, dtuttle1@gmail.com wrote:

> Hi Aaron,
> I'd like to learn more in this area too, but here are my thoughts:
> The web servers, at least mongrel, are single-threaded. Mongrel queues
> requests and feeds them to the app sequentially. To get concurrency
> you have to run multiple instances of mongrel. In this situation there
> are no thread safety issues because there's only one thread per
> process.

This is untrue.

The standard Mongrel is threaded. It creates a new thread of execution for each connection that it receives, and those run concurrently with each other and with the main Mongrel thread, which is essentially just an accept() loop that receives the requests and spawns handler threads for them.

The Rails mongrel handler has a mutex that locks the action within it to a single thread of execution at a time. So, if 10 requests come in at the same time, Mongrel will create 10 threads of execution for those 10 requests, but when execution flow reaches the Rails handler, each thread will stand in line at the mutex gate and proceed through it in single file.

In a standard Mongrel handler, which does not have a mutex at the front of it, the requests are processed concurrently. This is the normal situation.
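
The mutex gate can be pictured with a plain-Ruby sketch. GatedHandler is a made-up class for illustration and does not use Mongrel's real handler API; it only shows ten request threads queueing at one mutex and passing through one at a time:

```ruby
require 'thread'

# Hypothetical handler, not Mongrel's real API: every request gets its own
# thread, but the body of #process is guarded by one mutex (the "gate").
class GatedHandler
  attr_reader :max_inside

  def initialize
    @gate       = Mutex.new
    @inside     = 0   # threads currently past the gate
    @max_inside = 0   # highest value @inside ever reached
  end

  def process(request)
    @gate.synchronize do
      @inside += 1
      @max_inside = [@max_inside, @inside].max
      sleep 0.01                    # stand-in for the application's work
      @inside -= 1
      "handled #{request}"
    end
  end
end

handler = GatedHandler.new
# Ten simultaneous "requests", one thread each.
threads = Array.new(10) { |i| Thread.new { handler.process(i) } }
replies = threads.map(&:value)
puts "max threads inside the gate: #{handler.max_inside}"
```

A handler without the synchronize block would let several threads into process at once, which is the normal Mongrel situation described above.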

Remember that Ruby threads, being green threads, all live in the same process and are scheduled by the interpreter, so only one of them is actually running at any given moment. In most cases these threads will not increase your throughput.

Kirk Haines

···

On Sat, 25 Aug 2007, dtuttle1@gmail.com wrote:


In general, EventMachine encourages a style that doesn't use threads at all.
The I/O queueing you're describing is already done by EM itself. All you
have to do is write the handlers and EM will call them itself as the I/O
comes in.

It may seem impossible to write network-aware programs without threads. Not
only is it possible but it can have very real benefits.
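
To give a feel for the style, here is a single-threaded sketch built on IO.select from the standard library. It is not EventMachine's API: EchoHandler, run_loop and the max_events cutoff are all assumptions made to keep the example small and finite. The loop waits for readable sockets and calls the handler as data arrives, with no threads involved.

```ruby
require 'socket'

# Hypothetical handler: the loop calls receive_data whenever bytes arrive.
class EchoHandler
  def receive_data(socket, data)
    socket.write("echo: #{data}")
  end
end

# Single-threaded event loop: wait until some socket is readable, then
# dispatch to the handler. Stops after max_events chunks of data
# (an assumption, so the example terminates).
def run_loop(server, handler, max_events)
  clients = []
  handled = 0
  while handled < max_events
    readable, = IO.select([server] + clients)
    readable.each do |io|
      if io.equal?(server)
        clients << server.accept          # new connection
      else
        begin
          handler.receive_data(io, io.readpartial(4096))
          handled += 1
        rescue EOFError                   # peer hung up
          clients.delete(io)
          io.close
        end
      end
    end
  end
ensure
  clients.each { |c| c.close unless c.closed? }
end

server = TCPServer.new('127.0.0.1', 0)    # port 0: let the OS pick
client = TCPSocket.new('127.0.0.1', server.addr[1])
client.write('hi')
run_loop(server, EchoHandler.new, 1)
reply = client.read                       # returns once the server side closes
puts reply
client.close
server.close
```

The handler only ever runs on the one thread that owns the loop, so it needs no locking of its own.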

···

On 10/8/07, Roger Pack <rogerpack2005@gmail.com> wrote:


> Remember that Ruby threads, being green threads, are all in the same
> process, so there is no actual concurrency of execution between them. In
> most cases these threads will not increase your throughput.
>
> Kirk Haines

I've heard that Ruby 2.0 won't use green threads, so hopefully this will
change. I would assume the reason why Ruby uses green threads is to favor
portability over efficiency.

I should play with it; I've never tried Ruby for multi-threaded work. It might
not be worth my time, but learning something new would be fun.

TerryP.

···


Wow EventMachine works like a dream. Thank you!

···


it's interesting to me that people assume green threads provide less performance advantage than native threads. this is patently untrue: it all depends on your task! to summarize

- if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost

- if your task is io/network bound and/or you are spawning a TON of threads then green threads will provide a speedup on any decent (aka not windoze) platform

consider these facts

- green threads are inexpensive to create compared to native threads
- green threads can help throughput a lot where io is concerned iff select is a good paradigm for scheduling activity (imagine many network connections)

- native threads are relatively expensive to create
- native threads have the same bottleneck on io that green threads have: you can physically only write to disk with the number of disk controllers you have and reading from sockets may still be limited to the speed of the person on the other end

green threads are good for some things and native threads are good for some things. fortunately in ruby it's extremely easy to farm out tasks to another process and use ipc with

   slave = Slave.new{ Server.new }

so we get the best of both worlds if we want it. people will miss green threads when they are gone - am i the only one who remembers not being able to stop java's native threads?
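
For readers without the Slave library, the same idea can be sketched with only fork and a pipe from the standard library. This is not Slave's API: in_child_process is a made-up helper, it is Unix-only, and it assumes the block's result can be passed through Marshal.

```ruby
# Run a block in a child process and hand its result back to the parent.
# Unix-only (uses fork); the result must be Marshal-able.
def in_child_process
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode
  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))   # run the block in the child
    writer.close
  end
  writer.close
  payload = reader.read                 # blocks until the child closes its end
  reader.close
  Process.wait(pid)
  Marshal.load(payload)
end

# CPU-bound work done outside the parent interpreter.
sum = in_child_process { (1..100_000).inject(0) { |acc, n| acc + n } }
puts sum
```

Because the work runs in a separate process, it can be scheduled on another CPU no matter how the parent's threads are implemented.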

cheers.

a @ http://drawohara.com/

···

On Aug 25, 2007, at 9:29 AM, Terry Poulin wrote:

> I've heard that Ruby 2.0 won't use Green Threads, so hopefully this will
> change. I would assume the reason why Ruby uses Green Threads is to try and
> maintain portability over efficiency.

--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama

> it's interesting to me that people assume green threads provide less performance advantage than native threads. this is patently untrue: it all depends on your task! to summarize
>
> - if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost

How do you figure that? On a CPU intensive task, the threading overhead just takes away from the amount of time that the CPU has to work on the CPU intensive task.

> - if your task is io/network bound and/or you are spawning a TON of threads then green threads will provide a speedup on any decent (aka not windoze) platform

That's what I was saying. You have to have external latencies that you can capture without spending time blocking inside of an extension, and they have to represent a significant enough slice of the code's activities to overcome the overhead of thread creation, thread switching, and the additional cost of all of the extra objects that each thread and its contents imposes on the garbage collector.

So I still maintain that most of the time, for most code, Ruby threads are not a vehicle for improved performance, and are best thought of as a useful tool for elegantly solving problems instead of as a tool to make stuff go faster.
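
The "external latency" case can be seen in a small sketch, where sleep stands in for a slow socket or database call. The fetch and timed helpers are made up for illustration, and the timings will vary by machine:

```ruby
# Simulated external latency: sleep stands in for a slow remote call.
def fetch(id)
  sleep 0.2
  "response #{id}"
end

# Run a block and return its result together with the elapsed seconds.
def timed
  started = Time.now
  result  = yield
  [result, Time.now - started]
end

serial,   serial_secs   = timed { (1..5).map { |i| fetch(i) } }
threaded, threaded_secs = timed do
  (1..5).map { |i| Thread.new { fetch(i) } }.map(&:value)
end

puts format('serial: %.2fs, threaded: %.2fs', serial_secs, threaded_secs)
```

The threaded run finishes sooner only because the waits overlap. Replace the sleep with pure computation and, on an interpreter that runs one thread at a time, the gain disappears.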

Kirk Haines

···

On Sun, 26 Aug 2007, ara.t.howard wrote:

ara.t.howard wrote:

> consider these facts
>
> - green threads are inexpensive to create compared to native threads
> - green threads can help throughput a lot where io is concerned iff
> select is a good paradigm for scheduling activity (imagine many network
> connections)
>
> - native threads are relatively expensive to create
> - native threads have the same bottleneck on io that green threads have:
> you can physically only write to disk with the number of disk
> controllers you have and reading from sockets may still be limited to
> the speed of the person on the other end
>
> green threads are good for some things and native threads are good for
> some things. fortunately in ruby it's extremely easy to farm out tasks
> to another process and use ipc with
>
>   slave = Slave.new{ Server.new }
>
> so we get the best of both worlds if we want it. people will miss green
> threads when they are gone - am i the only one who remembers not being
> able to stop java's native threads?

Well ... I think we should have *both* green threads (i.e., a built-in
thread scheduler in a single Ruby process) and native threads (i.e., the
Linux "clone" operation creating a separate lightweight process sharing
a memory space). On top of that, we should have Erlang-style lightweight
Ruby processes communicating via message passing, something resembling
MPI and probably something resembling OpenMP. And of course there's DRb
and Rinda -- they aren't going away, are they?

Unfortunately, the whole world doesn't use the Linux kernel and the GCC
compilers, but for the part of the world that *does*, all of this is
doable via C-language libraries, and I'm guessing most of it *has* been
done. I know there's a "ruby-mpi" project, for example, although it
looks like it hasn't been touched in a couple of years and might be
orphaned.

khaines@enigo.com wrote:
...

>> - if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost
>
> How do you figure that? On a CPU intensive task, the threading overhead just removes from the amount of time that the CPU has to work on the CPU intensive task.

On second reading, I *think* what he was saying was one native thread per cpu.

···

On Sun, 26 Aug 2007, ara.t.howard wrote:

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

M. Edward (Ed) Borasky wrote:

> Well ... I think we should have *both* green threads (i.e., a built-in
> thread scheduler in a single Ruby process) and native threads (i.e., the
> Linux "clone" operation creating a separate lightweight process sharing
> a memory space).

Did the recent discussion of fibers lead to the conclusion that green threads would still exist in some form in 1.9? Will we be able to experiment with 1:N and M:N threading in pure ruby?

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

yes. the whole point of SMP is that you can scale up that way. thanks.

a @ http://drawohara.com/

···

On Aug 25, 2007, at 1:30 PM, Joel VanderWerf wrote:

On second reading, I *think* what he was saying was one native thread per cpu.


i sure hope so - no point throwing out the baby with the bath water....

cheers.

a @ http://drawohara.com/

···

On Aug 25, 2007, at 1:36 PM, Joel VanderWerf wrote:

Did the recent discussion of fibers lead to the conclusion that green threads would still exist in some form in 1.9? Will we be able to experiment with 1:N and M:N threading in pure ruby?


There's nothing wrong with experimentation, but there are good reasons why
the Linux kernel people moved away from the M:N model. The Sun threading
model is still M:N because it has been for a dozen years, but they changed
the default threading discipline to "pre-emptive" years ago because life is
just so much easier that way. From a programmer's perspective, Solaris
threads might as well be 1:1. (More precisely, it's an exceedingly rare
program that can benefit from direct dependence on the M:N model.)

Non-preemptive threads seem like a wonderful idea because they're so
lightweight. Having done it both ways, I can tell you that the big problem
with non-preemptive threads is the bugs you get when they don't get
scheduled at the right times. I think something like Erlang processes will
be far easier to work with, and they're every bit as lightweight.

···

On 8/25/07, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:


Ok. It confused me because we were talking about the current Ruby threading, which of course isn't helped by SMP.

Kirk Haines

···

On Sun, 26 Aug 2007, ara.t.howard wrote:

On Aug 25, 2007, at 1:30 PM, Joel VanderWerf wrote:

On second reading, I *think* what he was saying was one native thread per cpu.

yes. the whole point of SMP is that you can scale up that way. thanks.

So my weak brain is trying to put this all together. Here's what I make
of it and how it answers my questions. Correct me if I'm wrong.

> The Rails mongrel handler has a mutex that locks the action within it to
> a single thread of execution at a time. So, if 10 requests come in at the
> same time, Mongrel will create 10 threads of execution for those 10
> requests, but when execution flow reaches the Rails handler, each thread
> will stand in line at the mutex gate and proceed through it in single
> file.

When relying on Rails, the Rails handler has a mutex in it so that only
one thread goes through the handler at a time, and it lets the next
thread through for processing once that one is complete? So that would
tell me that if my app relies on Rails, I don't need to worry about
access to shared variables, such as static variables, since it isn't
truly executing multiple threads at once?

That does lead me to another question if using multiple Mongrel
processes. Multiple processes allow mongrel to receive more requests
thus creating more threads, but the Rails gateway is still opening the
gate for one thread at a time?

Also, all this discussion makes me think of another question. I'm using a
dispatcher in my application, found here
(http://derrick.pallas.us/ruby-cgi/). For anyone reading the code who
knows a lot about mutexes, does anything stand out as a bad solution in
this dispatcher, keeping in mind multiple requests at once and possible
shared resource headaches? I'm just trying to get a feel for what I need
to look into to make this dispatcher better and avoid shared resource
collisions.

···


The 'Rails gateway' is specific to a mongrel process - each process
has its own. Each process will handle one Rails thread at a time.
For that reason you don't have to worry about shared variables.
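
The per-process part of this can be shown with a small Unix-only sketch. The $hits global is a made-up stand-in for "static" state, and fork stands in for starting a second server process: a change made in the child is never seen by the parent.

```ruby
$hits = 0   # a global, standing in for "static" shared state

reader, writer = IO.pipe
pid = fork do
  reader.close
  $hits += 1                 # changes only the child's copy
  writer.puts $hits
  writer.close
end
writer.close
child_hits = reader.gets.to_i
reader.close
Process.wait(pid)

puts "child saw #{child_hits}, parent still has #{$hits}"
```

Anything that really must be shared between processes then has to live outside them, in a database, a file or a message passed over a pipe or socket.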
--Dave

···

On Aug 25, 4:15 pm, Aaron Smith <beingthexempl...@gmail.com> wrote:
