Synchronized attr_accessor

I have a fairly repetitive use case of having to define attributes and
have them lock protected. I was thinking of defining a attr_accessor
kind of directive but one that creates a lock protected attribute -

class Module
  def attr_sync_accessor(*symbols)
   symbols.each do |symbol|
     str = <<-EOF
       def #{symbol}()
         if @#{symbol}
           @#{symbol}.synchronize { return @#{symbol} }
         else
           nil
         end
       end
     EOF
     module_eval(str)
     str = <<-EOF
       def #{symbol}=(val)
         if @#{symbol}
           @#{symbol}.synchronize { @#{symbol} = val }
         else
           @#{symbol} = val
           @#{symbol}.extend(MonitorMixin)
         end
       end
     EOF
     module_eval(str)
   end
  end
end

···

#--------------

Now to use it you need to require monitor so

require 'monitor'

class A
     attr_sync_accessor :hello
     def get
       hello
     end
     def set(val)
      self.hello = val
   end
end

irb(main):190:0* d = A.new
=> #<D:0x27f77d4>
irb(main):191:0> d.set "h"
=> "h"
irb(main):192:0> d.get
=> "h"

This works but obviously has some flaws, like for example first time
creation of attribute is unprotected as is the nil check, also it does
not address scope [private, protected etc.].

I am looking for suggestions on improvements to this if anyone else
also finds it useful or if it is solved by someone in some other way
and I have overlooked something.

TIA
- Nasir

Could you describe are you trying to accomplish using these attributes? There is usually a better way to communicate between threads than setting/getting attributes on an object.

-mental

···

On Sat, 9 Jun 2007 04:34:42 +0900, Nasir Khan <rubylearner@gmail.com> wrote:

This works but obviously has some flaws, like for example first time
creation of attribute is unprotected as is the nil check, also it does
not address scope [private, protected etc.].

This is about having a instance variable always accessed under a lock,
if object state (instance variables) are accessed by many threads at
the same time
for eg. a hash accessed by several threads for getting values but some
operation also adding deleting from the hash. You would want a
synchronized access to your hash in this case...

So this could save some repetitive lines of code, that is all.

- nasir

···

On Jun 8, 12:48 pm, MenTaLguY <men...@rydia.net> wrote:

On Sat, 9 Jun 2007 04:34:42 +0900, Nasir Khan <rubylear...@gmail.com> wrote:
> This works but obviously has some flaws, like for example first time
> creation of attribute is unprotected as is the nil check, also it does
> not address scope [private, protected etc.].

Could you describe are you trying to accomplish using these attributes? There is usually a better way to communicate between threads than setting/getting attributes on an object.

-mental

Just because access to an instance variable is synchronized does not mean that access to the object it references is synchronized. Retrieving a hash object via a synchronized accessor does not make it safe against concurrent modification.

There are approaches which will work, but to know which ones to suggest, I will need to know more about the "bigger picture" that this code is part of.

-mental

···

On Sat, 9 Jun 2007 04:55:24 +0900, Nasir Khan <rubylearner@gmail.com> wrote:

This is about having a instance variable always accessed under a lock,
if object state (instance variables) are accessed by many threads at
the same time for eg. a hash accessed by several threads for getting values but some
operation also adding deleting from the hash. You would want a
synchronized access to your hash in this case...

There is no big picture.
The problem is simple - "create accessors for an instance variable
that access the variable under a lock".
Just as problem statement for "attr_accessor" could be - "create
accessors for instance variables"

- Nasir

PS: I understand that a synchronized access to hash is not the same as
synchronized access to hash values, but this problem is *not* about
it.

···

On Jun 8, 2:13 pm, MenTaLguY <men...@rydia.net> wrote:

On Sat, 9 Jun 2007 04:55:24 +0900, Nasir Khan <rubylear...@gmail.com> wrote:
> This is about having a instance variable always accessed under a lock,
> if object state (instance variables) are accessed by many threads at
> the same time for eg. a hash accessed by several threads for getting values but some
> operation also adding deleting from the hash. You would want a
> synchronized access to your hash in this case...

Just because access to an instance variable is synchronized does not mean that access to the object it references is synchronized. Retrieving a hash object via a synchronized accessor does not make it safe against concurrent modification.

There are approaches which will work, but to know which ones to suggest, I will need to know more about the "bigger picture" that this code is part of.

-mental

Hmm there is no such thing as a lock on an ivar.
By declaring a synchronized accessor there is just no way to know
anything about the behavior of the accessed object, it might be
threadsafe or immutable, therefore it does not make much sense to me
and if I have understood your code correctly neither to Ruby. Imagine
someone using your synchronized accessor to change the ivar to a
simple integer, your code would fail at the next access attempt.

Cheers
Robert

···

On 6/8/07, Nasir Khan <rubylearner@gmail.com> wrote:

There is no big picture.
The problem is simple - "create accessors for an instance variable
that access the variable under a lock".
Just as problem statement for "attr_accessor" could be - "create
accessors for instance variables"

Since you had asked for feedback on anything you'd missed, I was trying to find out if was some mitigating factor in the specific way your program worked (the one for which you originally wrote this code), before telling you that it won't work.

But -- unless the objects the attributes were set to were never modified, even if they didn't have the thread safety problems you already noted, the accessors generated still couldn't ensure thread safety. If thread 1 calls obj.some_accessor.foo, and thread 2 calls obj.some_accessor.bar (where #bar is some mutating method), some_accessor being synchronized simply _will not_ protect you.

-mental

···

On Sat, 9 Jun 2007 06:54:04 +0900, Nasir Khan <rubylearner@gmail.com> wrote:

There is no big picture.

Yeah I see what you guys mean...I was a little delusional. Now I
realize that I was looking for basically something equivalent to
Java's -
Collections.synchronizedMap(new HashMap(...))
...
Thanks

···

On 6/8/07, MenTaLguY <mental@rydia.net> wrote:

On Sat, 9 Jun 2007 06:54:04 +0900, Nasir Khan <rubylearner@gmail.com> wrote:
> There is no big picture.

Since you had asked for feedback on anything you'd missed, I was trying to find out if was some mitigating factor in the specific way your program worked (the one for which you originally wrote this code), before telling you that it won't work.

But -- unless the objects the attributes were set to were never modified, even if they didn't have the thread safety problems you already noted, the accessors generated still couldn't ensure thread safety. If thread 1 calls obj.some_accessor.foo, and thread 2 calls obj.some_accessor.bar (where #bar is some mutating method), some_accessor being synchronized simply _will not_ protect you.

-mental

Actually facets/more/synchash.rb does what I was looking for hash.

Thanks

···

On 6/8/07, Nasir Khan <rubylearner@gmail.com> wrote:

Yeah I see what you guys mean...I was a little delusional. Now I
realize that I was looking for basically something equivalent to
Java's -
Collections.synchronizedMap(new HashMap(...))
...
Thanks

On 6/8/07, MenTaLguY <mental@rydia.net> wrote:
> On Sat, 9 Jun 2007 06:54:04 +0900, Nasir Khan <rubylearner@gmail.com> wrote:
> > There is no big picture.
>
> Since you had asked for feedback on anything you'd missed, I was trying to find out if was some mitigating factor in the specific way your program worked (the one for which you originally wrote this code), before telling you that it won't work.
>
> But -- unless the objects the attributes were set to were never modified, even if they didn't have the thread safety problems you already noted, the accessors generated still couldn't ensure thread safety. If thread 1 calls obj.some_accessor.foo, and thread 2 calls obj.some_accessor.bar (where #bar is some mutating method), some_accessor being synchronized simply _will not_ protect you.
>
> -mental
>

Prodding it further I could come up with a method synchronizer.

For any class where method synchronization is required, mixin the
MethodSynchronizer and implement a class level method sync_methods providing
an array of methods to be
synchronized.

···

#-----------------------

module MethodSynchronizer

  def MethodSynchronizer.included(into)
    into.sync_methods.each do |m|
      MethodSynchronizer.wrap_method(into, m)
    end
  end

  def MethodSynchronizer.wrap_method(klass, meth)
    klass.class_eval do
      alias_method "__nonsync_#{meth}", "#{meth}"
      require 'thread'
      define_method(meth) do |*args|
        @__ms_lock = Mutex.new unless @__ms_lock
        @__ms_lock.synchronize do
          self.send("__nonsync_#{meth}",*args)
        end
      end
    end
  end
end
#-------------------------

Usage e.g.

  class D
    def self.sync_methods
      [:m1, :m2]
    end
      def m1
        puts "hello"
      end

      def m2
        puts "bye"
      end

      def m3
       puts "not synchronized"
      end
    include MethodSynchronizer
  end

Comments solicited.

- Nasir

On 6/8/07, Nasir Khan <rubylearner@gmail.com> wrote:

Actually facets/more/synchash.rb does what I was looking for hash.

Thanks

On 6/8/07, Nasir Khan <rubylearner@gmail.com> wrote:
> Yeah I see what you guys mean...I was a little delusional. Now I
> realize that I was looking for basically something equivalent to
> Java's -
> Collections.synchronizedMap(new HashMap(...))
> ...
> Thanks
>
> On 6/8/07, MenTaLguY <mental@rydia.net> wrote:
> > On Sat, 9 Jun 2007 06:54:04 +0900, Nasir Khan <rubylearner@gmail.com> wrote:
> > > There is no big picture.
> >
> > Since you had asked for feedback on anything you'd missed, I was

trying to find out if was some mitigating factor in the specific way your
program worked (the one for which you originally wrote this code), before
telling you that it won't work.

> >
> > But -- unless the objects the attributes were set to were never

modified, even if they didn't have the thread safety problems you already
noted, the accessors generated still couldn't ensure thread safety. If
thread 1 calls obj.some_accessor.foo, and thread 2 calls
obj.some_accessor.bar (where #bar is some mutating method), some_accessor
being synchronized simply _will not_ protect you.

> >
> > -mental
> >
>

Nasir Khan wrote:

    define_method(meth) do |*args|
      @__ms_lock = Mutex.new unless @__ms_lock
      @__ms_lock.synchronize do
        self.send("__nonsync_#{meth}",*args)
      end
    end

This still isn't completely thread-safe, I'm afraid. If the lock
doesn't exist yet and two threads both call some synchronized method
on the same object, you've got a race condition. I'd suggest creating
the lock upfront, before the synchronization wrappers are defined....
unfortunately, that makes implementing this idea a bit more
complicated.

Thanks for the feedback. Here is a refinement -

···

#---------------------
module MethodSynchronizer

  def MethodSynchronizer.included(into)
    into.sync_methods.each do |m|
      MethodSynchronizer.wrap_method(into, m)
    end
  end

  def MethodSynchronizer.wrap_method(klass, meth)
    klass.class_eval do
      alias_method "__nonsync_#{meth}", "#{meth}"
      require 'thread'
      @@__ms_c_lock = Mutex.new
      define_method(meth) do |*args|
        @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless @__ms_lock
} unless @__ms_lock
        @__ms_lock.synchronize do
          self.send("__nonsync_#{meth}",*args)
        end
      end
    end
  end
end
#----------------------

The @@__ms_c_lock is a class level lock which will be taken only of mutex
@__ms_lock is not yet created. This is the double checked lock pattern where
the mutex is again checked after taking the class level lock.

Thanks
Nasir

On 6/10/07, Caleb Clausen <vikkous@gmail.com> wrote:

Nasir Khan wrote:
> define_method(meth) do |*args|
> @__ms_lock = Mutex.new unless @__ms_lock
> @__ms_lock.synchronize do
> self.send("__nonsync_#{meth}",*args)
> end
> end

This still isn't completely thread-safe, I'm afraid. If the lock
doesn't exist yet and two threads both call some synchronized method
on the same object, you've got a race condition. I'd suggest creating
the lock upfront, before the synchronization wrappers are defined....
unfortunately, that makes implementing this idea a bit more
complicated.

I think there's still a race condition... you need to change this:

        @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless @__ms_lock
} unless @__ms_lock

To this:

        @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless @__ms_lock
unless @__ms_lock }

Which is somewhat uglier, since now the class level mutex has to be
checked on every synchronized method call.

Now, what I had in mind was a little more like this (INCOMPLETELY TESTED):

module MethodSynchronizer

def MethodSynchronizer.included(into)
   into.sync_methods.each do |m|
     MethodSynchronizer.wrap_method(into, m)
   end
end

def MethodSynchronizer.wrap_method(klass, meth)
   klass.class_eval do
     alias_method "__nonsync_#{meth}", "#{meth}"
     require 'thread'
     define_method(:initialize_synchronizer) do
       @__ms_lock=Mutex.new
       return self
     end
     define_method(meth) do |*args|
       @__ms_lock.synchronize do
         self.send("__nonsync_#{meth}",*args)
       end
     end
   end
end
end

And now you have to make sure that #initialize_synchronizer is called
on the object before any of the synchronized methods.

(What I really was thinking of was hacking into the regular
#initialize to get it to do the extra initialization automatically...
That can be hairy; among other things, you have to get it into the
#initialize of the class being mixed-in to, rather than the module's
#initialize. It's not impossible, but it requires more code than I
want to write right now.)

···

On 6/10/07, Nasir Khan <rubylearner@gmail.com> wrote:

Thanks for the feedback. Here is a refinement -

I think there's still a race condition... you need to change this:

       @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless @__ms_lock
} unless @__ms_lock

To this:

       @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless @__ms_lock
unless @__ms_lock }

What is the race condition you are talking about in the first snippet above?
And why did you change it to have two identical unless's one after the other
in the second snippet?
Note the unless is checking for @__ms_lock in *both* the unless's.

- Nasir

···

On 6/10/07, Caleb Clausen <vikkous@gmail.com> wrote:

On 6/10/07, Nasir Khan <rubylearner@gmail.com> wrote:
> Thanks for the feedback. Here is a refinement -

I think there's still a race condition... you need to change this:

        @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless
@__ms_lock
} unless @__ms_lock

To this:

        @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless
@__ms_lock
unless @__ms_lock }

Which is somewhat uglier, since now the class level mutex has to be
checked on every synchronized method call.

Now, what I had in mind was a little more like this (INCOMPLETELY TESTED):

module MethodSynchronizer

def MethodSynchronizer.included(into)
   into.sync_methods.each do |m|
     MethodSynchronizer.wrap_method(into, m)
   end
end

def MethodSynchronizer.wrap_method(klass, meth)
   klass.class_eval do
     alias_method "__nonsync_#{meth}", "#{meth}"
     require 'thread'
     define_method(:initialize_synchronizer) do
       @__ms_lock=Mutex.new
       return self
     end
     define_method(meth) do |*args|
       @__ms_lock.synchronize do
         self.send("__nonsync_#{meth}",*args)
       end
     end
   end
end

And now you have to make sure that #initialize_synchronizer is called
on the object before any of the synchronized methods.

(What I really was thinking of was hacking into the regular
#initialize to get it to do the extra initialization automatically...
That can be hairy; among other things, you have to get it into the
#initialize of the class being mixed-in to, rather than the module's
#initialize. It's not impossible, but it requires more code than I
want to write right now.)

       @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless
@__ms_lock
} unless @__ms_lock

What is the race condition...?

It's known as the "double-checked locking" antipattern. Generally, anything like:

unless @obj
   @lock.synchronize do
     unless @obj
       @obj = Whatever.new
     end
   end
end
@obj.some_method

is wrong. This is because (when dealing with real multi-threading -- not a major issue in Ruby yet, but it soon will be!) we depend on locks not only for preserving the order of operations, but ensuring that each thread has a consistent view of memory. For instance, when reading the instance variable @obj outside the protection of the lock, it is possible that a thread could @obj being non-null, but not seeing the effects of initializing the object until much later!

If you're interested in learning more about the details of the problem, read here:

The "Double-Checked Locking is Broken" Declaration

The article deals with Java, but in practice the problems apply to pretty much any contemporary language with native threads (and certainly to JRuby!).

Anwyay, the correct approach is something like this:

  obj = @lock.synchronize { @obj ||= Whatever.new }
  obj.some_method

Or even the slightly more verbose:

  obj = @lock.synchronize do
    @obj = Whatever.new unless @obj
    @obj
  end
  obj.some_method

Note that all assignments to @obj and all reads of @obj happen within the protection of the lock.

And why did you change it to have two identical unless's one after the
other in the second snippet?

Yes, the two unlesses are indeed redundant in that case, and failed to fix the problem generally if there were reads of the instance variable outside the lock.

-mental

···

On Tue, 12 Jun 2007 03:13:39 +0900, "Nasir Khan" <rubylearner@gmail.com> wrote:

Uh-oh, I misunderstood your code. I didn't notice that you had unless
inside the lock too. I apologize.

I wasn't even aware of this issue MenTaLguY is talking about. I would
have thought that your second version worked, (as indeed it does in
1.8, although this behavior isn't advertized) although I still prefer
my explicit initialization beforehand for something like this.....
however, is that even a valid solution? After reading MenTaLguY's link
I'm not so sure...

···

On 6/11/07, Nasir Khan <rubylearner@gmail.com> wrote:

What is the race condition you are talking about in the first snippet above?
And why did you change it to have two identical unless's one after the other
in the second snippet?
Note the unless is checking for @__ms_lock in *both* the unless's.

Wow, that was poor grammar on my part. Let me try again:

For instance, if reading the instance variable @obj outside the protection of the lock, it is possible for a thread to see that @obj is non-null long before the effects of the object's initialize method have become visible to it!

-mental

···

On Tue, 12 Jun 2007 03:45:36 +0900, MenTaLguY <mental@rydia.net> wrote:

For instance, when reading the instance variable @obj outside the
protection of the lock, it is possible that a thread could @obj being
non-null, but not seeing the effects of initializing the object until
much later!

MenTaLguY wrote:

       @@__ms_c_lock.synchronize { @__ms_lock = Mutex.new unless
@__ms_lock
} unless @__ms_lock

What is the race condition...?

It's known as the "double-checked locking" antipattern. Generally, anything like:

unless @obj
   @lock.synchronize do
     unless @obj
       @obj = Whatever.new
     end
   end
end
@obj.some_method

is wrong. This is because (when dealing with real multi-threading --
not a major issue in Ruby yet, but it soon will be!) we depend on
locks not only for preserving the order of operations, but ensuring
that each thread has a consistent view of memory. For instance, when
reading the instance variable @obj outside the protection of the
lock, it is possible that a thread could @obj being non-null, but not
seeing the effects of initializing the object until much later!

(...and later restated)

For instance, if reading the instance variable @obj outside the
protection of the lock, it is possible for a thread to see that @obj
is non-null long before the effects of the object's initialize method
have become visible to it!

So you're saying that, as the line

         @obj = Whatever.new

is executed, it is possible (in some ruby implementations, but not MRI 1.8) that @obj will be assigned the Whatever instance *before* the #new call completes? (That does seem to be what the wikipedia entry is warning about.[1])

Scary! Your point that we should rely on the locking primitives for all access (and hope that they get the memory barrier right) is well taken. Which means always paying the sync cost:

   def obj
     @lock.synchronize do
       @obj ||= Whatever.new
     end
   end

I wonder if the following will be a more efficient alternative, or worse because of the singleton method:

   def obj
     @lock.synchronize do
       @obj = Whatever.new
       def self.obj
         @obj
       end
     end
     @obj
   end

I suppose that has the same problem, depending on implementation, in that the method definition could be reordered before the assignment...

[1] Double-checked locking - Wikipedia

···

On Tue, 12 Jun 2007 03:13:39 +0900, "Nasir Khan" <rubylearner@gmail.com> wrote:

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

This begs a few questions -

- Is it true for Ruby today?
- If not then will it ever be true in future?
- Is there some text somewhere where Ruby memory model for interpreter
developers? Like Java Language spec?
- Since Ruby doesnt have a volatile keyword does it mean that all variable
access is essentially non-volatile?

Thanks
Nasir

···

On 6/11/07, MenTaLguY <mental@rydia.net> wrote:

On Tue, 12 Jun 2007 03:45:36 +0900, MenTaLguY <mental@rydia.net> wrote:
> For instance, when reading the instance variable @obj outside the
> protection of the lock, it is possible that a thread could @obj being
> non-null, but not seeing the effects of initializing the object until
> much later!

Wow, that was poor grammar on my part. Let me try again:

For instance, if reading the instance variable @obj outside the protection
of the lock, it is possible for a thread to see that @obj is non-null long
before the effects of the object's initialize method have become visible to
it!

-mental

So you're saying that, as the line

         @obj = Whatever.new

is executed, it is possible (in some ruby implementations, but not MRI
1.8) that @obj will be assigned the Whatever instance *before* the #new
call completes? (That does seem to be what the wikipedia entry is
warning about.[1])

More or less. Some of this is a result of instruction reordering performed by both the compiler and the CPU, and some of this is the result of stuff not getting written out to shared memory right away. Either way, it's possible for one thread's operations to appear in a different order to other threads.

The effect of a memory barrier is to bring all threads involved into sync, so to speak, so that whatever is going on behind the scenes, they all see things happening in the same order. Memory barriers are the blue pill -- you really don't want to see how deep the concurrency rabbit hole goes.

I wonder if the following will be a more efficient alternative, or worse
because of the singleton method:

   def obj
     @lock.synchronize do
       @obj = Whatever.new
       def self.obj
         @obj
       end
     end
     @obj
   end

I suppose that has the same problem, depending on implementation, in
that the method definition could be reordered before the assignment...

There's also a race condition. Consider:

thread #1: enters #obj
thread #2: enters #obj
thread #2: acquires @lock
thread #1: waits for @lock
thread #2: @obj = Whatever.new (#<Whatever:0x1234abcd>)
thread #2: redefines obj
thread #2: releases @lock
thread #2: returns @obj (#<Whatever:0x1234abcd>)
thread #1: wakes up
thread #1: acquires @lock
thread #1: @obj = Whatever.new (#<Whatever:0xcafebeef>)
thread #1: redefines obj
thread #1: releases @lock
thread #1: returns @obj (#<Whatever:0xcafebeef>)

Sometimes when writing multi-threaded code, it helps to pretend that the thread scheduler hates your guts and has it in for you personally. I think you'd have to do something like this at least:

def obj
   @lock.synchronize do
     @obj ||= Whatever.new
     def self.obj
       @obj
     end
     @obj
   end
end

But there's still the un-synchronized read of @obj from within the singleton method, which can even occur before @lock has been released by the writing thread. It might often work in practice, since the Ruby implementation is (hopefully) doing some kind of synchronization when reading/updating its method tables, preventing the CPU from reordering things across the method definition. But a lot depends on the memory model, and there's some pretty crazy stuff out there (*cough* Alpha *cough)...

It's probably best to stick to the simplest correct way:

def obj
   @lock.synchronize do
     @obj ||= Whatever.new
   end
end

If the locking turns out to be a bottleneck, then the method's callers can cache its result in variables private to their respective threads, which will not require any locking at all to read.

-mental

···

On Tue, 12 Jun 2007 07:32:11 +0900, Joel VanderWerf <vjoel@path.berkeley.edu> wrote: