Meditations on threads and thread safety

Hi –

I seem to keep coming up with Ruby projects that run aground on the
matter of thread safety. The most recent was the little
Object#stretch thing which extended objects module-wise for the
duration of a block. The larger earlier project that ran into the
same thing was Ruby Behaviors, which provides a way to do block-scoped
(but not thread safe) changes to core classes and modules.

I’ve been trying to think through the matter of thread safety
systematically. I’ve started to wonder whether there are different
kinds of thread safety, like there are different kinds of
infinities…

I’m not trying to argue a lax position about thread safety. I’m just
trying to understand it as deeply as I can. The difference between
Ruby code that does something and Ruby code that does something
thread-safely seems so vast; one essentially loses out on all the
conciseness of the language. At least that’s my impression. Mind
you, I’ve done very little thread programming, which may explain why I
find the “it’s not thread safe” semi-brick wall sort of exasperating,
and also may explain any deficiencies in my grasp of the
technicalities.

Anyway, here’s a snapshot of my current possibly rambling thoughts –
some notes to myself, neatened up a bit. I’d be interested in hearing
what people think.

                 *    *    *

Ruby is already not thread-safe, in the sense that if you don’t know
whether other threads are running, then weird things can always
happen. For example, if you do this:

obj.each { |e| puts e.upcase }

it’s possible that upcase will be undefined or redefined for the
elements of obj between iterations, if you’re going on the assumption
that other threads might be running and that you don’t know what they
will be doing.
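To make that scenario concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the helper name iterate_with_rogue_thread, the anonymous stand-in class (used so that no core class gets touched), and the two queues, which are there only to force the redefinition to land exactly between the first and second iteration.

```ruby
# Hypothetical helper: iterates over two items while a "rogue" thread
# redefines the method being called between the two iterations.
def iterate_with_rogue_thread
  # Anonymous stand-in class, so no core class is modified.
  item_class = Class.new do
    def initialize(text)
      @text = text
    end

    def upcase
      @text.upcase
    end
  end

  items = [item_class.new("a"), item_class.new("b")]
  go    = Queue.new # main -> rogue: "redefine now"
  done  = Queue.new # rogue -> main: "redefinition finished"

  rogue = Thread.new do
    go.pop
    item_class.class_eval do
      def upcase
        "redefined"
      end
    end
    done << true
  end

  results = []
  items.each_with_index do |e, i|
    results << e.upcase
    next unless i.zero?
    go << true # let the rogue thread run after the first iteration...
    done.pop   # ...and wait until it has finished
  end
  rogue.join
  results
end

p iterate_with_rogue_thread # => ["A", "redefined"]
```

In real code nothing forces the timing like this, which is the point: the second call could see either definition.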

Therefore, lack of thread safety (of this kind) is something we live
with. There may be a rogue thread out there somewhere (we assume).

So… what exactly does it mean to say that a particular practice is
not thread safe? I guess it means that the practice itself is the
rogue thread.

But if we’re planning around rogue threads – if we always act as if
they might be there – then theoretically shouldn’t it be OK if they
are there? Or is this one of those game theory (at least the
popularized versions) questions, where A and B have to act together to
succeed but don’t know what the other is doing?

Let’s take Ruby Behaviors as an example. Let’s say you change the
behavior of Array#join for the duration of a block:

b = Behavior.new("MyBehavior")
b.adopt do
  # … stuff …
end

The problem here is that some other thread might be using Array#join,
and might expect it to work in the normal way.
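A rough sketch of how such a leak can be observed follows. It is not the Ruby Behaviors API: the Joiner class is a hypothetical stand-in for Array, and with_loud_join and observe_leak are made-up helpers. The override is global, so a bystander thread that calls join while the block is active gets the changed behavior.

```ruby
# Hypothetical stand-in for a core class such as Array.
class Joiner
  def initialize(*parts)
    @parts = parts
  end

  def join
    @parts.join(",")
  end
end

# Hypothetical block-scoped override (NOT the Ruby Behaviors API):
# swaps Joiner#join for the duration of the block, then restores it.
def with_loud_join
  Joiner.class_eval do
    alias_method :__original_join, :join
    define_method(:join) { __original_join.upcase }
  end
  yield
ensure
  Joiner.class_eval do
    alias_method :join, :__original_join
    remove_method :__original_join
  end
end

# Returns what a bystander thread saw during the block, and what the
# main thread sees after the block has ended.
def observe_leak
  entered = Queue.new # main -> bystander: "the override is active"
  seen    = Queue.new # bystander -> main: result of its join call

  bystander = Thread.new do
    entered.pop
    seen << Joiner.new("a", "b").join
  end

  during = nil
  with_loud_join do
    entered << true
    during = seen.pop
  end
  bystander.join
  [during, Joiner.new("a", "b").join]
end

p observe_leak # => ["A,B", "a,b"]
```

The block scope holds in time (the original method is back afterwards) but not across threads.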

OK, that’s bad. But the question is… is it worse than the regular
conditions under which Ruby code runs? If my premise is right, every
call to Array#join already runs the risk of taking place after some
other thread has done something weird. So what’s the difference?

That’s what I’m trying to figure out. Is it just quantity (i.e.,
increasing the likelihood of “something weird” going on in a distant
thread)? Or are the baseline conditions of Ruby actually less fragile
than I’m suggesting?

Of course the sensible thing to do is to declare all modifications to
core functionality to be dangerous and avoid them. That’s probably
reasonable… but I just can’t get away from the thought that the
changeable nature of Ruby objects, including core classes, is
something with depths that can be explored, rather than one of these
"enough rope to hang yourself" things.

Is it enough to document that something isn’t thread-safe, and let
people decide whether or not to use it? Is there in fact such a thing
as absolutely knowing that multiple threads will not be running?

This is all very fragmentary but I’m just trying to stir up and
(eventually :-) clarify my thoughts on the subject.

David


David Alan Black
home: dblack@candle.superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

I think traditionally thread safety makes one think of two things,
the first being kind of a special case of the second:

  • reentrant functions, i.e. functions that don’t break when entered
    by several threads in arbitrary order
  • shared resources, i.e. resources that can deal with multiple
    concurrent requests

What you’re talking about seems to open up a whole new world
of things that can go wrong, since you’re changing semantics at
runtime. But at the very heart of it, I think you require a mechanism
that either isolates the changes to a specific thread or serializes
execution of the function you’re changing.
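The first option, isolating the change to one thread, might look roughly like the sketch below. It assumes the method itself can be written to consult a per-thread flag, which is a simplification of changing a core method after the fact. ScopedList, with_loud_join and observe_isolation are hypothetical names; thread_variable_get and thread_variable_set are the standard Thread methods for thread-local values.

```ruby
# Hypothetical class whose join checks a thread-local flag on each call.
class ScopedList
  def initialize(*parts)
    @parts = parts
  end

  def join
    plain = @parts.join(",")
    Thread.current.thread_variable_get(:loud_join) ? plain.upcase : plain
  end

  # Turns the changed behavior on for the calling thread only,
  # and puts the previous setting back when the block ends.
  def self.with_loud_join
    previous = Thread.current.thread_variable_get(:loud_join)
    Thread.current.thread_variable_set(:loud_join, true)
    yield
  ensure
    Thread.current.thread_variable_set(:loud_join, previous)
  end
end

# Returns [what the changing thread saw, what a bystander thread saw],
# both sampled while the block was active.
def observe_isolation
  entered = Queue.new
  seen    = Queue.new

  bystander = Thread.new do
    entered.pop
    seen << ScopedList.new("a", "b").join
  end

  mine = other = nil
  ScopedList.with_loud_join do
    entered << true
    other = seen.pop
    mine  = ScopedList.new("a", "b").join
  end
  bystander.join
  [mine, other]
end

p observe_isolation # => ["A,B", "a,b"]
```

The cost is a flag lookup on every call, and the changed method has to cooperate; nothing here retrofits isolation onto an arbitrary existing method.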

If you choose to serialize access to the changed functions the next
question would be: what granularity should the concurrency
mechanism support? Global, module, class, object?
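As a small illustration of two of those granularities, the sketch below guards the same non-atomic update once with a single global Mutex and once with one Mutex per object. Tally, GLOBAL_LOCK and hammer are hypothetical names; the Thread.pass inside bump is only there to invite interleaving, so that the locks have something to prevent.

```ruby
# One lock shared by every Tally (global granularity).
GLOBAL_LOCK = Mutex.new

class Tally
  attr_reader :count

  def initialize
    @count = 0
    @lock  = Mutex.new # one lock per instance (object granularity)
  end

  def bump_with_global_lock
    GLOBAL_LOCK.synchronize { bump }
  end

  def bump_with_object_lock
    @lock.synchronize { bump }
  end

  private

  # Deliberately non-atomic read-modify-write.
  def bump
    current = @count
    Thread.pass
    @count = current + 1
  end
end

# Runs the named method from several threads and returns the final count.
def hammer(tally, method_name, threads: 4, times: 50)
  workers = Array.new(threads) do
    Thread.new { times.times { tally.public_send(method_name) } }
  end
  workers.each(&:join)
  tally.count
end

puts hammer(Tally.new, :bump_with_global_lock) # prints 200
puts hammer(Tally.new, :bump_with_object_lock) # prints 200
```

Both give the right answer; the difference is that the global lock also makes unrelated Tally objects wait on each other, while the per-object lock does not.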

You could do what Java does and have a monitor on each object
but I don’t think that approach has found many friends.
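For comparison, Ruby's standard library has something close to a per-object monitor in MonitorMixin. The sketch below is only an illustration of that mechanism, with a hypothetical Account class and deposit_concurrently helper; it says nothing about whether the approach is a good fit for the problem in this thread.

```ruby
require "monitor"

# Hypothetical class carrying its own monitor, Java-style.
class Account
  include MonitorMixin

  attr_reader :balance

  def initialize(balance)
    super() # lets MonitorMixin set up the monitor
    @balance = balance
  end

  def deposit(amount)
    synchronize do
      current = @balance
      Thread.pass # invite interleaving; the monitor keeps it out
      @balance = current + amount
    end
  end

  # Monitors are reentrant: deposit can be called while already
  # holding the same object's monitor.
  def deposit_twice(amount)
    synchronize do
      deposit(amount)
      deposit(amount)
    end
  end
end

# Hypothetical driver: several threads depositing into one account.
def deposit_concurrently(account, threads: 4, times: 25)
  workers = Array.new(threads) do
    Thread.new { times.times { account.deposit_twice(1) } }
  end
  workers.each(&:join)
  account.balance
end

puts deposit_concurrently(Account.new(0)) # prints 200
```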

Andrew Queisser

dblack@candle.superlink.net wrote in message
news:Pine.LNX.4.44.0301091233090.27026-100000@candle.superlink.net
