Thread safety techniques for server applications?

Aaron_Smith · 25 August 2007 02:05

Hey all,

I'm looking for some information about handling thread safety with Ruby.
I've got an application server I wrote that I need to make sure it's
thread safe. This application server is used over http requests so it's
possible multiple people hit it at once. I have some questions that will
help me determine..

1. Does using mongrel / lighttpd / webrick ensure thread safety? (my
application relies on these)
2. What kinds of things in the Ruby language should I NOT do that will
cause thread headaches (maybe static variables)?
3. What techniques can I use to go about testing thread safety?

I'm not asking anything in reference to rails. This would be just
general Ruby thread safety ideas..

thanks..

--
Posted via http://www.ruby-forum.com/.

dtuttle1 · 25 August 2007 07:05

Hi Aaron,
I'd like to learn more in this area too, but here are my thoughts:
The web servers, at least mongrel, are single-threaded. Mongrel queues
requests and feeds them to the app sequentially. To get concurrency
you have to run multiple instances of mongrel. In this situation there
are no thread safety issues because there's only one thread per
process.
I like the idea of separate processes instead of worrying about thread
safety, but sometimes I need multiple threads, for example in a jabber
client (keepalives, listeners, etc). What I've been doing is keeping
it as simple as possible and so far I haven't had to think about
thread conflicts. Or maybe I should be, but I haven't.
--Dave

Kirk_Haines · 25 August 2007 13:37

> I'm looking for some information about handling thread safety with Ruby.
> I've got an application server I wrote that I need to make sure it's
> thread safe. This application server is used over http requests so it's
> possible multiple people hit it at once. I have some questions that will
> help me determine..

In general, it's the same as any other type of threaded programming. Share as little as possible, and control access to shared resources so that two threads aren't changing a resource's state at the same time and running into each other. Look at the Mutex class and the Queue class as starting points for tools to help you do this.
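
As a rough illustration of those two tools, here is a minimal sketch. The `HitCounter` class, the worker count, and the use of `nil` as a stop signal are all made up for the example; only `Mutex` and `Queue` come from Ruby itself.

```ruby
require 'thread' # Mutex and Queue; already loaded on modern Rubies

# Hypothetical shared resource: a counter touched by every request thread.
class HitCounter
  def initialize
    @count = 0
    @lock  = Mutex.new
  end

  # Only one thread at a time may run the read-modify-write below.
  def increment
    @lock.synchronize { @count += 1 }
  end

  def count
    @lock.synchronize { @count }
  end
end

counter = HitCounter.new
jobs    = Queue.new              # thread-safe hand-off between threads

workers = Array.new(4) do
  Thread.new do
    # Queue#pop blocks until a job arrives; a nil job means "stop".
    counter.increment while jobs.pop
  end
end

100.times { |i| jobs << i }
4.times { jobs << nil }          # one stop signal per worker
workers.each(&:join)

puts counter.count
```

The point of the design is that the lock lives inside the object that owns the state, so callers cannot forget to take it, and the queue is the only thing the producer and the workers share.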

> 1. Does using mongrel / lighttpd / webrick ensure thread safety? (my
> application relies on these)

lighttpd is an external web server, so it's irrelevant.

Both mongrel and webrick are threaded Ruby web server platforms. Neither, however, does anything to ensure that the code you run inside it is thread safe.

> 2. What kinds of things in the Ruby language should I NOT do that will
> cause thread headaches.. (maybe static variables)?

The main thing to keep in mind is that Ruby threads are green threads: they are all scheduled inside of the Ruby interpreter, so they all share a single process. Thus, the use of threads will rarely increase the throughput of your program, unless there is some external latency that can be captured, and that external latency does not occur inside of a Ruby extension.

This is because while the flow of execution is inside of an extension, it is out of Ruby's control, and no thread task switching will take place.

Also, be aware that Ruby uses a select() loop to manage its threads of execution, and it has an fd_setsize limit of 1024 handles, so there is a sharp upper boundary on the number of threads you can have in a Ruby process.

> 3. What techniques can I use to go about testing thread safety?

Look for areas in your code where you share resources between your threads. Do you take precautions to keep multiple threads from stepping on each other when using those resources?

Write test code that creates multiple threads, and tries to stress those areas.
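
A stress test along those lines might look like the sketch below. `Registry` is a hypothetical stand-in for whatever state the application shares, and `hammer` is a made-up helper name; the thread and iteration counts are arbitrary.

```ruby
require 'thread'

# Hypothetical class under test; stands in for the application's shared state.
class Registry
  def initialize
    @items = []
    @lock  = Mutex.new
  end

  def add(item)
    @lock.synchronize { @items << item }
  end

  def size
    @lock.synchronize { @items.size }
  end
end

# Run the given block from many threads at once and wait for all of them.
def hammer(thread_count, iterations)
  threads = Array.new(thread_count) do |t|
    Thread.new do
      iterations.times { |i| yield t, i }
    end
  end
  threads.each(&:join)
end

registry = Registry.new
hammer(20, 500) { |t, i| registry.add([t, i]) }

expected = 20 * 500
actual   = registry.size
puts(actual == expected ? "ok: #{actual} items" : "LOST UPDATES: #{actual} of #{expected}")
```

A passing run does not prove the code is thread safe, since a race may simply not have fired this time. A failing run, however, is proof that something is wrong, so it is worth running such tests repeatedly and with more threads than production will see.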

Kirk Haines

Roger_Pack4 · 8 October 2007 18:17

Roger Pack wrote:

>> I'm looking for some information about handling thread safety with Ruby.
>
> Note also that sometimes if you are reading from TCPSockets the sockets
> get confused and start reading from one another. To avoid this (I
> think) use Francis Cianfrocca's EventMachine.

Possible ways to fix this might (might) be to ensure that every socket
read/write is 'not at the same time as any other read/write' (i.e.
surrounded by a mutex lock), or to perhaps write a drop in replacement
for the TCPSocket class that just uses EventMachine in the background
for I/O and queues the input/output.
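
The mutex variant could be sketched as follows. `LockedIO` is a hypothetical wrapper, not an existing class, and a `StringIO` stands in for the `TCPSocket` so the example runs without a network. Whether this cures the socket confusion described above is untested.

```ruby
require 'thread'
require 'stringio'

# Hypothetical wrapper: every read and write on the wrapped IO goes through
# one lock, so two threads never touch the same IO at the same moment.
class LockedIO
  def initialize(io)
    @io   = io
    @lock = Mutex.new
  end

  def write_line(line)
    @lock.synchronize { @io.write(line + "\n") }
  end

  def read_line
    @lock.synchronize { @io.gets }
  end
end

# StringIO stands in for a TCPSocket here.
out    = StringIO.new
locked = LockedIO.new(out)

threads = Array.new(5) do |n|
  Thread.new { 20.times { |i| locked.write_line("thread #{n} line #{i}") } }
end
threads.each(&:join)

lines = out.string.lines
puts lines.size
```

One caveat with a real socket: a thread that holds the lock while blocked in a read keeps every writer waiting, so separate read and write locks, or a single dedicated I/O thread fed by a Queue, may be a better fit.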

I still haven't found a fix for the problem where methods defined in
one thread end up assigned to a different thread. I think
I may just report this one to ruby and forget about it.

Good luck all.
-Roger

Corey_Jewett1 · 25 August 2007 07:22

I don't believe this is true of mongrel itself, but rather of the Rails handler in mongrel.

Corey

On Aug 25, 2007, at 12:05 AM, dtuttle1@gmail.com wrote:

> Hi Aaron,
> I'd like to learn more in this area too, but here are my thoughts:
> The web servers, at least mongrel, are single-threaded. Mongrel queues
> requests and feeds them to the app sequentially. To get concurrency
> you have to run multiple instances of mongrel. In this situation there
> are no thread safety issues because there's only one thread per
> process.

Kirk_Haines · 25 August 2007 13:24

This is untrue.

The standard Mongrel is threaded. It creates a new thread of execution for each connection that it receives, and those execute in parallel with each other and the main Mongrel thread, which is essentially just an accept() loop that receives the requests and spawns handler threads for them.

The Rails mongrel handler has a mutex that locks the action within it to a single thread of execution at a time. So, if 10 requests come in at the same time, Mongrel will create 10 threads of execution for those 10 requests, but when execution flow reaches the Rails handler, each thread will stand in line at the mutex gate and proceed through it in single file.

In a standard Mongrel handler, which does not have a mutex at the front of it, the requests are processed concurrently. This is the normal situation.
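
To make the difference concrete, here is a small simulation. It is not Mongrel's real API: `Handler`, `serve`, and the `sleep` that stands in for application work are all invented for the sketch. It records how many threads are ever inside the handler at once, with and without a mutex at the front.

```ruby
require 'thread'

# Hypothetical handler, not Mongrel's real API. With a gate, only one thread
# at a time is inside #process; without one, threads overlap.
class Handler
  attr_reader :max_inside

  def initialize(gate = nil)
    @gate       = gate          # a Mutex, or nil for "no gate"
    @inside     = 0
    @max_inside = 0
    @stats_lock = Mutex.new     # protects the two counters only
  end

  def process(request)
    if @gate
      @gate.synchronize { work(request) }
    else
      work(request)
    end
  end

  private

  def work(_request)
    @stats_lock.synchronize do
      @inside += 1
      @max_inside = @inside if @inside > @max_inside
    end
    sleep 0.05                  # stands in for the application's work
    @stats_lock.synchronize { @inside -= 1 }
  end
end

# One thread per "connection", as described above for the standard Mongrel.
def serve(handler, requests)
  requests.map { |r| Thread.new { handler.process(r) } }.each(&:join)
end

gated   = Handler.new(Mutex.new)
ungated = Handler.new
serve(gated,   (1..10).to_a)
serve(ungated, (1..10).to_a)

puts "gated: at most #{gated.max_inside} thread(s) inside"
puts "ungated: at most #{ungated.max_inside} thread(s) inside"
```

The gated handler never has more than one thread inside it, which is why code behind such a gate does not see concurrent access. The ungated handler does, so anything it shares needs its own protection.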

Remember that Ruby threads, being green threads, are all in the same process, so there is no actual concurrency of execution between them. In most cases these threads will not increase your throughput.

Kirk Haines

On Sat, 25 Aug 2007, dtuttle1@gmail.com wrote:

> I'd like to learn more in this area too, but here are my thoughts:
> The web servers, at least mongrel, are single-threaded. Mongrel queues
> requests and feeds them to the app sequentially. To get concurrency
> you have to run multiple instances of mongrel. In this situation there
> are no thread safety issues because there's only one thread per
> process.

Francis_Cianfrocca · 8 October 2007 18:35

In general, EventMachine encourages a style that doesn't use threads at all.
The I/O queueing you're describing is already done by EM itself. All you
have to do is write the handlers and EM will call them itself as the I/O
comes in.

It may seem impossible to write network-aware programs without threads. Not
only is it possible but it can have very real benefits.
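
For readers who have not seen the style, the sketch below shows its general shape using only the standard library. It is not EventMachine's API: `MiniReactor` and `watch` are invented names, the loop is a bare `IO.select`, and pipes stand in for network connections. EM does this work (and much more) for you.

```ruby
# Not EventMachine's API: a hand-rolled loop over IO.select, just to show
# the shape of handler-driven I/O in a single thread.
class MiniReactor
  def initialize
    @handlers = {}              # IO => block called with each chunk of data
  end

  def watch(io, &handler)
    @handlers[io] = handler
  end

  # Loop until every watched IO has reached end-of-file.
  def run
    until @handlers.empty?
      ready, = IO.select(@handlers.keys)
      ready.each do |io|
        begin
          @handlers[io].call(io.read_nonblock(4096))
        rescue IO::WaitReadable
          next                  # not actually ready; try again next pass
        rescue EOFError
          @handlers.delete(io)  # peer closed; stop watching this IO
          io.close
        end
      end
    end
  end
end

received = Hash.new { |hash, key| hash[key] = String.new }
reactor  = MiniReactor.new

# Pipes stand in for network connections so the sketch needs no sockets.
%w[alpha beta].each do |name|
  reader, writer = IO.pipe
  writer.write("hello from #{name}")
  writer.close
  reactor.watch(reader) { |data| received[name] << data }
end

reactor.run
puts received.inspect
```

Because only one handler runs at a time, the handlers can touch shared state such as `received` without any locking, which is one of the benefits of the approach.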

On 10/8/07, Roger Pack <rogerpack2005@gmail.com> wrote:

> Roger Pack wrote:
> >> I'm looking for some information about handling thread safety with Ruby.
> >
> > Note also that sometimes if you are reading from TCPSockets the sockets
> > get confused and start reading from one another. To avoid this (I
> > think) use Francis Cianfrocca's EventMachine.
>
> Possible ways to fix this might (might) be to ensure that every socket
> read/write is 'not at the same time as any other read/write' (i.e.
> surrounded by a mutex lock), or to perhaps write a drop in replacement
> for the TCPSocket class that just uses EventMachine in the background
> for I/O and queues the input/output.

Terry_Poulin · 25 August 2007 15:29

> Remember that Ruby threads, being green threads, are all in the same
> process, so there is no actual concurrency of execution between them. In
> most cases these threads will not increase your throughput.
>
> Kirk Haines

I've heard that Ruby 2.0 won't use green threads, so hopefully this will
change. I would assume the reason Ruby uses green threads is to favor
portability over efficiency.

I should play with it; I've never tried Ruby for multi-threaded work. It might
not be worth my time, but learning something new would be fun.

TerryP.

Roger_Pack4 · 9 October 2007 14:58

Wow EventMachine works like a dream. Thank you!

a11 · 25 August 2007 15:59

it's interesting to me that people assume green threads provide less performance advantage than native threads. this is patently untrue: it all depends on your task! to summarize:

- if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost

- if your task is io/network bound and/or you are spawning a TON of threads then green threads will provide a speedup on any decent (aka not windoze) platform

consider these facts

- green threads are inexpensive to create compared to native threads
- green threads can help throughput a lot where io is concerned iff select is a good paradigm for scheduling activity (imagine many network connections)

- native threads are relatively expensive to create
- native threads have the same bottleneck on io that green threads have: you can physically only write to disk with the number of disk controllers you have and reading from sockets may still be limited to the speed of the person on the other end

green threads are good for some things and native threads are good for other things. fortunately in ruby it's extremely easy to farm out tasks to another process and use ipc with

slave = Slave.new{ Server.new }

so we get the best of both worlds if we want it. people will miss green threads when they are gone - am i the only one who remembers not being able to stop java's native threads?
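
For anyone without the slave library installed, the same idea can be sketched with the standard library alone. This is not slave's API: `in_child_process` is a made-up helper built on `fork`, `IO.pipe`, and `Marshal`, and it assumes a platform where `fork` is available.

```ruby
# Not the slave library's API: a plain fork plus a pipe, to show the idea of
# doing work in another process and getting the answer back over IPC.
def in_child_process
  reader, writer = IO.pipe
  reader.binmode
  writer.binmode

  pid = fork do
    reader.close
    writer.write(Marshal.dump(yield))   # run the block in the child
    writer.close
    exit!(0)                            # skip at_exit hooks inherited from the parent
  end

  writer.close
  result = Marshal.load(reader.read)    # blocks until the child closes its end
  reader.close
  Process.wait(pid)
  result
end

# A CPU-bound task that a second process can run on another core.
sum = in_child_process { (1..1_000_000).inject(0) { |a, b| a + b } }
puts sum
```

The child has its own copy of memory, so nothing it changes can collide with the parent's threads; the only shared thing is the pipe.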

cheers.

a @ http://drawohara.com/

--
we can deny everything, except that we have the possibility of being better. simply reflect on that.
h.h. the 14th dalai lama

Kirk_Haines · 25 August 2007 19:15

>> I've heard that Ruby 2.0 won't use Green Threads, so hopefully this will
>> change. I would assume the reason why Ruby uses Green Threads is to try and
>> maintain portability over efficiency.
>
> it's interesting to me that people assume green threads provide less performance advantage than native threads. this is patently untrue: it all depends on your task! to summarize
>
> - if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost

How do you figure that? On a CPU intensive task, the threading overhead just removes from the amount of time that the CPU has to work on the CPU intensive task.

> - if your task is io/network bound and/or you are spawning a TON of threads then green threads will provide a speedup on any decent (aka not windoze) platform

That's what I was saying. You have to have external latencies that you can capture without spending time blocking inside of an extension, and they have to represent a significant enough slice of the code's activities to overcome the overhead of thread creation, thread switching, and the additional cost of all of the extra objects that each thread and its contents imposes on the garbage collector.

So I still maintain that most of the time, for most code, Ruby threads are not a vehicle for improved performance, and are best thought of as a useful tool for elegantly solving problems instead of as a tool to make stuff go faster.
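
The case where threads do help, capturing external latency, can be seen in a small timing sketch. Here `sleep` stands in for a slow socket or remote service, and `slow_call` is a made-up name; real gains depend on how much of the work is actually waiting.

```ruby
require 'benchmark'

# sleep stands in for external latency (a slow socket, a remote database).
def slow_call
  sleep 0.1
end

serial = Benchmark.realtime { 5.times { slow_call } }

threaded = Benchmark.realtime do
  Array.new(5) { Thread.new { slow_call } }.each(&:join)
end

puts format("serial:   %.2fs", serial)
puts format("threaded: %.2fs", threaded)
```

The threaded run finishes sooner because the waits overlap. Replace the `sleep` with a pure computation and the advantage should disappear, since the threads then compete for the same interpreter.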

Kirk Haines

M_Edward_Ed_Borasky1 · 25 August 2007 19:20

Well ... I think we should have *both* green threads (i.e., a built-in
thread scheduler in a single Ruby process) and native threads (i.e., the
Linux "clone" operation creating a separate lightweight process sharing
a memory space). On top of that, we should have Erlang-style lightweight
Ruby processes communicating via message passing, something resembling
MPI and probably something resembling OpenMP. And of course there's DRb
and Rinda -- they aren't going away, are they?

Unfortunately, the whole world doesn't use the Linux kernel and the GCC
compilers, but for the part of the world that *does*, all of this is
doable via C-language libraries, and I'm guessing most of it *has* been
done. I know there's a "ruby-mpi" project, for example, although it
looks like it hasn't been touched in a couple of years and might be
orphaned.

Joel_VanderWerf1 · 25 August 2007 19:30

khaines@enigo.com wrote:
...

>> - if your task is cpu intensive AND you are on an SMP box AND you use many threads (aka lightweight processes) you will see a speed boost
>
> How do you figure that? On a CPU intensive task, the threading overhead just removes from the amount of time that the CPU has to work on the CPU intensive task.

On second reading, I *think* what he was saying was one native thread per cpu.

--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Joel_VanderWerf1 · 25 August 2007 19:36

M. Edward (Ed) Borasky wrote:

> Well ... I think we should have *both* green threads (i.e., a built-in
> thread scheduler in a single Ruby process) and native threads (i.e., the
> Linux "clone" operation creating a separate lightweight process sharing
> a memory space).

Did the recent discussion of fibers lead to the conclusion that green threads would still exist in some form in 1.9? Will we be able to experiment with 1:N and M:N threading in pure ruby?

a11 · 25 August 2007 21:06

yes. the whole point of SMP is that you can scale up that way. thanks.

a @ http://drawohara.com/

On Aug 25, 2007, at 1:30 PM, Joel VanderWerf wrote:

> On second reading, I *think* what he was saying was one native thread per cpu.

a11 · 25 August 2007 21:09

i sure hope so - no point throwing out the baby with the bath water....

cheers.

a @ http://drawohara.com/

On Aug 25, 2007, at 1:36 PM, Joel VanderWerf wrote:

> Did the recent discussion of fibers lead to the conclusion that green threads would still exist in some form in 1.9? Will we be able to experiment with 1:N and M:N threading in pure ruby?

Francis_Cianfrocca · 26 August 2007 01:43

There's nothing wrong with experimentation, but there are good reasons why
the Linux kernel people moved away from the M:N model. The Sun threading
model is still M:N because it has been for a dozen years, but they changed
the default threading discipline to "pre-emptive" years ago because life is
just so much easier that way. From a programmer's perspective, Solaris
threads might as well be 1:N. (More precisely, it's an exceedingly rare
program that can benefit from direct dependence on the M:N model.)

Non-preemptive threads seem like a wonderful idea because they're so
lightweight. Having done it both ways, I can tell you that the big problem
with non-preemptive threads is the bugs you get when they don't get
scheduled at the right times. I think something like Erlang processes will
be far easier to work with, and they're every bit as lightweight.

On 8/25/07, Joel VanderWerf <vjoel@path.berkeley.edu> wrote:

> M. Edward (Ed) Borasky wrote:
> > Well ... I think we should have *both* green threads (i.e., a built-in
> > thread scheduler in a single Ruby process) and native threads (i.e., the
> > Linux "clone" operation creating a separate lightweight process sharing
> > a memory space).
>
> Did the recent discussion of fibers lead to the conclusion that green
> threads would still exist in some form in 1.9? Will we be able to
> experiment with 1:N and M:N threading in pure ruby?

Kirk_Haines · 25 August 2007 21:45

Ok. It confused me because we were talking about the current Ruby threading, which of course isn't helped by SMP.

Kirk Haines

On Sun, 26 Aug 2007, ara.t.howard wrote:

> On Aug 25, 2007, at 1:30 PM, Joel VanderWerf wrote:
>
>> On second reading, I *think* what he was saying was one native thread per cpu.
>
> yes. the whole point of SMP is that you can scale up that way. thanks.

Aaron_Smith · 25 August 2007 23:15

So my weak brain is trying to put this all together. Here's what I make
of it and how it answers my questions. Correct me if I'm wrong.

> The Rails mongrel handler has a mutex that locks the action within it to
> a single thread of execution at a time. So, if 10 requests come in at the
> same time, Mongrel will create 10 threads of execution for those 10
> requests, but when execution flow reaches the Rails handler, each thread
> will stand in line at the mutex gate and proceed through it in single
> file.

When relying on Rails, the Rails handler has a mutex in it so that only
one thread goes through the handler at a time, and the next thread is let
through only once the current one completes? So that would tell me that
if my app is relying on Rails, I don't need to worry about access to
shared variables, such as static variables, as it isn't truly executing
multiple threads at once?

That does lead me to another question if using multiple Mongrel
processes. Multiple processes allow mongrel to receive more requests
thus creating more threads, but the Rails gateway is still opening the
gate for one thread at a time?

Also, all this discussion makes me think of another question. I'm using a
dispatcher in my application, found here: http://derrick.pallas.us/ruby-cgi/
For someone who reads the code and knows a lot about mutexes, does
anything stand out as a bad solution in this dispatcher, keeping in mind
multiple requests at once and possible shared resource headaches? I'm
just trying to get a feel for what I need to look into to make this
dispatcher better and avoid shared resource collisions.

dtuttle1 · 30 August 2007 16:20

The 'Rails gateway' is specific to a mongrel process - each process
has its own. Each process will handle one Rails thread at a time.
For that reason you don't have to worry about shared variables.
--Dave
