Threads and DRb

I changed the title here because this is not
strictly a continuation of the other thread
(no pun on the words “thread” or “continuation”).

This discussion has made me wonder if there is
some “generic” way of handling serializing in
druby.

Wouldn’t it be nice if you could simply mix in
a module, like

myobj.extend(ThreadSafe)

and then start the service normally and forget it?

How would you approach this? I have an idea but
it seems clunky.

Something similar could perhaps be done to allow
pooling.

Thoughts??

Hal

Hmm. DRb uses obj.__send__ to invoke a method. (I don’t know what the
difference is between this and obj.send)

So, in principle you could do:

module ThreadSafe
  def __tsinit
    @__tsmutex = Mutex.new
  end
  def __send__(*args)
    @__tsmutex.synchronize { super(*args) }
  end
end

When you run this Ruby says:
threadsafe.rb:5: warning: redefining `__send__' may cause serious problem

but surprisingly it does actually seem to work:

a = SlowServer.new
a.extend ThreadSafe
a.__tsinit
DRb.start_service('druby://localhost:9000', a)

I would really like the ‘extend ThreadSafe’ to perform the initialisation of
the instance variable automatically though. Is there a way to do that?

Regards,

Brian.

···

On Tue, Mar 11, 2003 at 07:16:34AM +0900, Hal E. Fulton wrote:

This discussion has made me wonder if there is
some “generic” way of handling serializing in
druby.

Wouldn’t it be nice if you could simply mix in
a module, like

myobj.extend(ThreadSafe)

and then start the service normally and forget it?

How would you approach this? I have an idea but
it seems clunky.

I always use MonitorMixin.

require 'monitor'

class MyClass
  include MonitorMixin

  def initialize
    super

  end

  def foo
    synchronize do

    end
  end
end

Do you want to call synchronize automatically?
How about something like Delegator?

require 'monitor'

class SyncDelegator
  include MonitorMixin

  def initialize(obj)
    super()
    preserved = ::Kernel.instance_methods
    preserved -= ["to_s","to_a","inspect","==","=~","==="]
    for t in self.type.ancestors
      preserved |= t.instance_methods
      preserved |= t.private_instance_methods
      preserved |= t.protected_instance_methods
      break if t == SyncDelegator
    end
    for method in obj.methods
      next if preserved.include? method
      eval <<-EOS
        def self.#{method}(*args, &block)
          begin
            synchronize do
              __getobj__.__send__(:#{method}, *args, &block)
            end
          rescue Exception
            $@.delete_if{|s| /:in `__getobj__'$/ =~ s} #`
            $@.delete_if{|s| /^\\(eval\\):/ =~ s}
            raise
          end
        end
      EOS
    end
  end

  def __getobj__
    raise NotImplementError, "need to define `__getobj__'"
  end

end

class SimpleSyncDelegator < SyncDelegator

  def initialize(obj)
    super
    @obj = obj
  end

  def __getobj__
    @obj
  end

  def __setobj__(obj)
    @obj = obj
  end
end

if __FILE__ == $0
  class Slow
    def do_it
      p [:do_it, Time.now]
      sleep 2
      p [:done]
    end
  end

  if true
    # sync
    slow = Slow.new
    $slow = SimpleSyncDelegator.new(slow)
  else
    # don't sync
    $slow = Slow.new
  end

  t1 = Thread.new do
    5.times do
      $slow.do_it
    end
  end

  t2 = Thread.new do
    5.times do
      $slow.do_it
    end
  end

  t1.join
  t2.join
end

(I could be wrong)
It seems to me __send__() is provided in case you define your own send()

···

On Tue, 11 Mar 2003 07:57:29 +0900, Brian Candler B.Candler@pobox.com wrote:

Hmm. DRb uses obj.__send__ to invoke a method. (I don’t know what the
difference is between this and obj.send)

So, in principle you could do:

module ThreadSafe
  def __tsinit
    @__tsmutex = Mutex.new
  end
  def __send__(*args)
    @__tsmutex.synchronize { super(*args) }
  end
end

When you run this Ruby says:
threadsafe.rb:5: warning: redefining `__send__' may cause serious problem

but surprisingly it does actually seem to work:

a = SlowServer.new
a.extend ThreadSafe
a.__tsinit
DRb.start_service('druby://localhost:9000', a)

I’d like Matz’s comments on when this redefinition
is/isn’t safe.

I would really like the ‘extend ThreadSafe’ to perform the initialisation of
the instance variable automatically though. Is there a way to do that?

I suppose you could add a line to
__send__:

if not @__tsmutex then @__tsmutex = Mutex.new end

But then that slows down every method call by a
few more microseconds.

Probably a better way, but it doesn’t jump out
at me.

Hal

···

----- Original Message -----
From: “Brian Candler” B.Candler@pobox.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Monday, March 10, 2003 4:57 PM
Subject: Re: Threads and DRb

I always use MonitorMixin.

[snip]

Very good… having to remember to do
the synchronize is a minor problem.

Do you want to call synchronize automatically?
How about something like Delegator?

:-) You read my mind.

[snip]

The only problem is that you must
subclass another class. If you
have already inherited from
something else, this is a problem.

What do you think of Robert’s
solution?

Hal

···

----- Original Message -----
From: “Masatoshi SEKI” m_seki@mva.biglobe.ne.jp
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Tuesday, March 11, 2003 4:36 AM
Subject: Re: Threads and DRb

“Hal E. Fulton” wrote:

a.extend ThreadSafe

If it is always done like above, couldn’t you do the

if not @__tsmutex then @__tsmutex = Mutex.new end

In a system hook once per server object?

http://www.rubycentral.com/book/ref_c_module.html#Module.extend_object
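
The hook idea above can be sketched roughly as follows. This is only an illustration, assuming the ThreadSafe module from earlier in the thread; SlowServer and its do_it method are hypothetical stand-ins. Module#extend_object is called by obj.extend, so the mutex exists before any call arrives and no separate __tsinit step is needed.

```ruby
# Sketch only: create the mutex at the moment the module is mixed in.
module ThreadSafe
  # Ruby calls this hook for obj.extend(ThreadSafe).
  def self.extend_object(obj)
    super                                            # do the actual extension
    obj.instance_variable_set(:@__tsmutex, Mutex.new)
  end

  # Only calls that arrive through __send__ are serialized;
  # a plain a.do_it bypasses this method.
  def __send__(*args, &block)
    @__tsmutex.synchronize { super(*args, &block) }
  end
end

# Hypothetical server object, for illustration only.
class SlowServer
  def do_it
    :done
  end
end

a = SlowServer.new
a.extend(ThreadSafe)          # mutex is created here, no a.__tsinit needed
a.__send__(:do_it)
```

A Mutex is not reentrant, so a method that itself calls __send__ on the same object would deadlock under this sketch; a Monitor would avoid that.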

···


Kent Dahl [ http://www.stud.ntnu.no/~kentda/ ]
student, NTNU - graduate engineering - 5. year
Industrial economics and technological management
(engineering.discipline=Computer::Technology)

I would really like the ‘extend ThreadSafe’ to perform the initialisation of
the instance variable automatically though. Is there a way to do that?

I suppose you could add a line to
__send__:

if not @__tsmutex then @__tsmutex = Mutex.new end

or shorter: @__tsmutex ||= Mutex.new

But then that slows down every method call by a
few more microseconds.

More importantly, it’s not thread-safe for the creation of the Mutex itself.
I guess it’s unlikely that this race would cause a problem, but in theory it
could.

I like Kent’s suggestion of using the extend_object trigger at the point
where the Module is included, this wasn’t something I was aware of.

In the end, because I needed a pattern which distributed requests amongst
multiple objects, I came up with the attached solution. It lets you create a
pool of objects, and you can pass the object representing the pool to a DRb
server (or a SOAP server, or any server which runs concurrent threads on the
same object). Incoming method calls are parcelled out to the individual
objects, and no object handles more than one request at a time.

I’d be interested in any comments on the attached solution (especially flaws
that I’ve overlooked); if it is sound then I’ll stick it up on the Wiki.

Regards,

Brian.

facade.rb (2.12 KB)

···

On Tue, Mar 11, 2003 at 08:48:09AM +0900, Hal E. Fulton wrote:

“Kent Dahl” kentda@stud.ntnu.no wrote in message
news:3E6D8640.68BEFA28@stud.ntnu.no…

“Hal E. Fulton” wrote:

a.extend ThreadSafe

If it is always done like above, couldn’t you do the

if not @__tsmutex then @__tsmutex = Mutex.new end

In a system hook once per server object?

http://www.rubycentral.com/book/ref_c_module.html#Module.extend_object

Can’t one just alias initialize and redefine it?

module ThreadSafe
  alias __tsinit initialize

  def initialize(*args)
    @__tsmutex = Mutex.new
    __tsinit(*args)
  end
  def __send__(*args)
    @__tsmutex.synchronize { super(*args) }
  end
end

robert

In the end, because I needed a pattern which distributed requests amongst
multiple objects, I came up with the attached solution. It lets you create a
pool of objects, and you can pass the object representing the pool to a DRb
server (or a SOAP server, or any server which runs concurrent threads on the
same object). Incoming method calls are parcelled out to the individual
objects, and no object handles more than one request at a time.

I’d be interested in any comments on the attached solution (especially flaws
that I’ve overlooked); if it is sound then I’ll stick it up on the Wiki.

That’s pretty interesting.

I take it this solves threading and pooling at
the same time, and it’s transparent from the
client end?

I’m a little fuzzy on the details of how
it works. Do you feel like clarifying?

For example: Does every object have to have
alive? and suicide methods, or is this just
an example? Why is alive? defined in the
module? If these two methods are just for
example purposes, how do we determine in
general when a pool entry is usable or
active or whatever?

Thanks,
Hal

···

----- Original Message -----
From: “Brian Candler” B.Candler@pobox.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Tuesday, March 11, 2003 5:45 PM
Subject: Re: Threads and DRb

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030311234504.GA51909@uk.tiscali.com

I’d be interested in any comments on the attached solution (especially flaws
that I’ve overlooked); if it is sound then I’ll stick it up on the Wiki.

Great work!

Things I like:

Using a block as factory.

Things I don’t understand:

Why do you introduce method alive? Are there conditions under which
an instance in the pool can die? If not, you could check sanity when
you take back an instance and put it into the pool.

Why do you redefine the set of methods to invoke method_missing? This
happens anyway if a method is not implemented.

Room for improvement:

I’d change the design in a way that the pool and the facade are
separated. This is orthogonal functionality and by extracting the
pool functionality to another class the design becomes cleaner and
more modular, hence improved reusability.

The @poolnotempty.wait(@mutex) should be in a loop: with only one element
in the pool and two threads waiting, if both are signalled only one can
fetch the element and the other has to continue waiting. Normally this
can’t occur in your example, but there might be other signals sent via the
same condition variable; it is safer to recheck the condition.

You can look at some stuff I put here for threading purposes:
http://www.rubygarden.org/ruby?MultiThreading

Kind regards

robert

“Kent Dahl” kentda@stud.ntnu.no wrote in message
news:3E6D8640.68BEFA28@stud.ntnu.no…

“Hal E. Fulton” wrote:

a.extend ThreadSafe

If it is always done like above, couldn’t you do the

if not @__tsmutex then @__tsmutex = Mutex.new end

In a system hook once per server object?

http://www.rubycentral.com/book/ref_c_module.html#Module.extend_object

Can’t one just alias initialize and redefine it?

module ThreadSafe
  alias __tsinit initialize

  def initialize(*args)
    @__tsmutex = Mutex.new
    __tsinit(*args)
  end
  def __send__(*args)
    @__tsmutex.synchronize { super(*args) }
  end
end

Good ideas, Kent and Robert.

I think I like this one best so far.

I’m still curious to know the potential
dangers of redefining __send__, if anyone
can elaborate.

Hal

···

----- Original Message -----
From: “Robert Klemme” bob.news@gmx.net
Newsgroups: comp.lang.ruby
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Tuesday, March 11, 2003 3:50 AM
Subject: Re: Threads and DRb

I’d be interested in any comments on the attached solution (especially flaws
that I’ve overlooked); if it is sound then I’ll stick it up on the Wiki.

That’s pretty interesting.

I take it this solves threading and pooling at
the same time, and it’s transparent from the
client end?

I’m a little fuzzy on the details of how
it works. Do you feel like clarifying?

Sure. @pool is created as an array of object instances. Whenever an
incoming method request arrives, the facade picks one off the front of the
array. If it has an ‘alive?’ method then it checks that first, and replaces
it with a fresh object if it returns false. It then sends it the requested
message. The ‘ensure’ clause puts it back at the end of the array after the
method has finished (successfully or not).

Modifications to @pool are mutex-protected. The objects themselves need no
mutex, because each one is removed from the pool while it is processing a
message, and therefore it won’t receive a second one.
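
Since facade.rb itself is not reproduced in this thread, here is a rough sketch of the mechanism as described above, not the attached code. The class name PoolFacadeSketch, the Worker class and the helper names checkout/checkin are all hypothetical. It uses method_missing for forwarding and waits in a loop when the pool is empty.

```ruby
# Sketch of a pool facade: each incoming call borrows one object from the
# pool, so no pooled object ever handles two requests at once.
class PoolFacadeSketch
  def initialize(size, &factory)
    @factory      = factory                      # block used to build objects
    @pool         = Array.new(size) { factory.call }
    @mutex        = Mutex.new                    # protects @pool only
    @poolnotempty = ConditionVariable.new
  end

  def method_missing(name, *args, &block)
    obj = checkout
    begin
      obj.public_send(name, *args, &block)
    ensure
      checkin(obj)                               # runs even if the call raised
    end
  end

  private

  def checkout
    obj = @mutex.synchronize do
      @poolnotempty.wait(@mutex) while @pool.empty?
      @pool.shift                                # take from the front
    end
    # Replace an object that reports itself as no longer usable.
    obj = @factory.call if obj.respond_to?(:alive?) && !obj.alive?
    obj
  end

  def checkin(obj)
    @mutex.synchronize do
      @pool.push(obj)                            # put back at the end
      @poolnotempty.signal
    end
  end
end

# Hypothetical pooled object that complains if used by two threads at once.
class Worker
  def initialize
    @busy = false
  end

  def work(n)
    raise "concurrent use" if @busy
    @busy = true
    sleep 0.01
    n * 2
  ensure
    @busy = false
  end
end

pool    = PoolFacadeSketch.new(2) { Worker.new }
results = 6.times.map { |i| Thread.new { pool.work(i) } }.map(&:value)
```

The pooled objects need no locking of their own here because an object is out of the array for the whole duration of a call.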

For example: Does every object have to have
alive? and suicide methods, or is this just
an example?

‘suicide’ was just an example, but it shows how to use ‘alive?’

My thinking is that a pool object may often contain other objects, for
example a DBI handle. If something goes seriously wrong with that and the
database disconnects, it is more convenient for the pool object to signal
that it wishes to be recreated from scratch, than having to handle the
re-initialisation of its component objects itself.

Why is alive? defined in the
module?

If the user happens to send an ‘alive?’ message to the pool front object, it
would be forwarded to an arbitrary member of the pool, which is a bit of a
waste of time. In particular, it might happen if you set up a chain of
objects, where a pool contains objects which are themselves DRb front-ends
to a remote object pool:

          machine A                     machine B

     P1              ,-- O1 --.  P2                ,-- O4
    ----> SimplePool --- O2 ----------> SimplePool --- O5
                     `-- O3 --'                    `-- O6

(I had this running with protocol P1 = SOAP and protocol P2 = DRb !)

As part of its normal operation, SimplePool sends ‘alive?’ to O1/O2/O3, and
it’s pretty pointless to proxy this to a random selection out of O4/O5/O6.
However, in this case, it would be better for O1/O2/O3 to trap ‘alive?’
locally to save the round-trip requests.

Actually, I was seeing exceptions raised, and I think I was being confused
by the fact that ‘respond_to?’ and ‘alive?’ can’t be sent directly over SOAP
without mapping to a different name (which doesn’t contain a question mark).
NaHi is going to improve the error reporting in that case.

If the short-circuit ‘alive?’ were removed, I wouldn’t have an objection :-)

Regards,

Brian.

···

On Wed, Mar 12, 2003 at 09:26:21AM +0900, Hal E. Fulton wrote:

Things I don’t understand:

Why do you introduce method alive? Are there conditions under which
an instance in the pool can die? If not, you could check sanity when
you take back an instance and put it into the pool.

Sure, I didn’t see much difference in checking sanity before or after the
call though. The instance in the pool itself can’t “die”, but it could get
into a state where it is easier just to reinitialise it.

Thinking about this again, maybe it makes sense to check sanity only after
an exception has been raised. In that case, the sanity check can be
something reasonably active: e.g.

def alive?
  @dbh.select_one("select 123 from dual") == 123
end

If that returns false or raises an exception, then you can be pretty sure that the
instance has a problem, and it needs re-initialising.

Perhaps even cleverer would be to have a separate thread for putting new
objects into the pool, which means you can rate-limit them to one every
second (say)

Why do you redefine the set of methods to invoke method_missing? This
happens anyway if a method is not implemented.

It doesn’t work with DRb. DRb validates method calls explicitly and raises
an exception if they are not present: in drb/drb.rb see
‘check_insecure_method’, which is called from its own ‘method_missing’

So to keep DRb happy, the pool object has to have real methods corresponding
to the ones in the pool objects.

You don’t need to list the methods explicitly in the facade in cases where
method_missing is sufficient though.
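
One way to give a front object real methods, sketched under assumptions: the names ForwardingFacade and Counter are hypothetical, and this is not taken from facade.rb. Each listed method is defined on the facade with define_singleton_method, so a server that checks for a method's existence before dispatching sees it there.

```ruby
# Sketch: define a real forwarding method on the facade for every name
# given, instead of relying on method_missing.
class ForwardingFacade
  def initialize(target, names)
    @target = target
    names.each do |name|
      define_singleton_method(name) do |*args, &block|
        @target.public_send(name, *args, &block)
      end
    end
  end
end

# Hypothetical object to sit behind the facade.
class Counter
  def initialize
    @n = 0
  end

  def incr
    @n += 1
  end
end

facade = ForwardingFacade.new(Counter.new, Counter.public_instance_methods(false))
facade.respond_to?(:incr)   # the method now really exists on the facade
```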

Room for improvement:

I’d change the design in a way that the pool and the facade are
separated. This is orthogonal functionality and by extracting the
pool functionality to another class the design becomes cleaner and
more modular, hence improved reusability.

Sounds reasonable. What do you suggest: (facade) has-a (pool)?
I can’t think of a good way to express that though.

The @poolnotempty.wait(@mutex) should be in a loop because with only
one element in the pool and two threads waiting and both are signalled
only one can fetch the element and the other has to continue waiting.
Normally this can’t occur in your example

Even if it did, wouldn’t that be the correct behaviour? i.e. two threads are
waiting on an empty pool, the cv is signalled when one object is put back
into the pool (which allows one of the waiting threads to take it back out
again). When a second object is put back into the pool the cv will be
signalled again, so the second waiting thread will be woken. Or am I missing
a sequence which causes this to fail?

You can look at some stuff I put here for threading purposes:
http://www.rubygarden.org/ruby?MultiThreading

Nice introduction. I found ‘ObjectPoolingAndThreading’ before, but the code
given assumed that it was spawning fresh threads itself - which doesn’t work
where some other process is spawning them.

Cheers,

Brian.

···

On Wed, Mar 12, 2003 at 06:13:46PM +0900, Robert Klemme wrote:

Brian,

I slept on it and came up with this pool implementation (see attachment).
The facade part is still missing, but I’ve no experience with drb so I guess
that’s the part where you can give better input.

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030312210421.GA52600@uk.tiscali.com

Things I don’t understand:

Why do you introduce method alive? Are there conditions under which
an instance in the pool can die? If not, you could check sanity when
you take back an instance and put it into the pool.

Sure, I didn’t see much difference in checking sanity before or after the
call though. The instance in the pool itself can’t “die”, but it could get
into a state where it is easier just to reinitialise it.

Thinking about this again, maybe it makes sense to check sanity only after
an exception has been raised. In that case, the sanity check can be
something reasonably active: e.g.

def alive?
  @dbh.select_one("select 123 from dual") == 123
end

If that returns false or raises an exception, then you can be pretty sure that the
instance has a problem, and it needs re-initialising.

Sounds reasonable but not feasible with a separated pool. My pool does it
on insertion and retrieval. Maybe this should be changed if we assume that
instances don’t change their state in the pool. Then checking on reinsert
should be sufficient.

One could adhere to the pattern that an instance is only put back into the
pool if the method did not throw. This is easily achieved by changing the
pattern

obj = pool.get
begin
  obj.do()
ensure
  pool.put obj
end

to

obj = pool.get
obj.do()
pool.put obj

which will omit the put() in case of an exception.

Perhaps even cleverer would be to have a separate thread for putting new
objects into the pool, which means you can rate-limit them to one every
second (say)

Why do you redefine the set of methods to invoke method_missing? This
happens anyway if a method is not implemented.

It doesn’t work with DRb. DRb validates method calls explicitly and raises
an exception if they are not present: in drb/drb.rb see
‘check_insecure_method’, which is called from its own ‘method_missing’

So to keep DRb happy, the pool object has to have real methods corresponding
to the ones in the pool objects.

You don’t need to list the methods explicitly in the facade in cases where
method_missing is sufficient though.

Ah! I see.

Room for improvement:

I’d change the design in a way that the pool and the facade are
separated. This is orthogonal functionality and by extracting the
pool functionality to another class the design becomes cleaner and
more modular, hence improved reusability.

Sounds reasonable, how do you suggest: (facade) has-a (pool) ?
I can’t think of a good way to express that though.

Well, just hand the pool over to the facade on construction or create it in
initialize().

The @poolnotempty.wait(@mutex) should be in a loop because with only
one element in the pool and two threads waiting and both are signalled
only one can fetch the element and the other has to continue waiting.
Normally this can’t occur in your example

Even if it did, wouldn’t that be the correct behaviour? i.e. two threads are
waiting on an empty pool, the cv is signalled when one object is put back
into the pool (which allows one of the waiting threads to take it back out
again). When a second object is put back into the pool the cv will be
signalled again, so the second waiting thread will be woken. Or am I missing
a sequence which causes this to fail?

No, as I said this can’t happen in your example. But the more general
pattern is to repeat the check. That’s not too costly and you avoid strange
errors if the condition variable is used for other signals. (I can’t see
such an application yet, and maybe I’m too much influenced by the Java
notify() and notifyAll()…)
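
To illustrate the "repeat the check" pattern, here is a minimal sketch; TinyPool is a hypothetical name and this is not the attached pool. It deliberately uses broadcast (roughly the notifyAll() case), which wakes every waiter, so a woken thread must test the condition again before taking an element.

```ruby
# Sketch: a pool whose get rechecks the condition after every wakeup.
class TinyPool
  def initialize(items)
    @items     = items.dup
    @mutex     = Mutex.new
    @not_empty = ConditionVariable.new
  end

  def get
    @mutex.synchronize do
      # A plain `if` would let a second woken thread shift from an empty
      # array and get nil; `while` sends it back to waiting instead.
      @not_empty.wait(@mutex) while @items.empty?
      @items.shift
    end
  end

  def put(item)
    @mutex.synchronize do
      @items.push(item)
      @not_empty.broadcast      # wakes all waiters, not just one
    end
  end
end

pool = TinyPool.new([:only])
got  = Queue.new
threads = 2.times.map do
  Thread.new do
    item = pool.get
    sleep 0.01
    got << item
    pool.put(item)
  end
end
threads.each(&:join)
```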

I think the situation changes with a separated pool: in that case you want
to limit the number of threads that are running concurrently, and the pool
does not need a condition variable (see my example). Instead the facade
instance needs a semaphore which is initialized to the maximum number of
threads, decremented on entry to a method and incremented on exit.
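
A counting semaphore of that kind might look roughly like this. It is a sketch under assumptions: CountingSemaphore is a hypothetical class (not part of Ruby's standard library), built here from Mutex and ConditionVariable, and the surrounding usage only measures how many threads are inside the guarded section at once.

```ruby
# Sketch: a counting semaphore that a facade could use to cap the number
# of threads running inside its methods.
class CountingSemaphore
  def initialize(count)
    @count     = count
    @mutex     = Mutex.new
    @available = ConditionVariable.new
  end

  def acquire                    # "decrement on entry"
    @mutex.synchronize do
      @available.wait(@mutex) while @count.zero?
      @count -= 1
    end
  end

  def release                    # "increment on exit"
    @mutex.synchronize do
      @count += 1
      @available.signal
    end
  end

  def with
    acquire
    begin
      yield
    ensure
      release
    end
  end
end

sem     = CountingSemaphore.new(2)
lock    = Mutex.new
running = 0
peak    = 0
threads = 6.times.map do
  Thread.new do
    sem.with do
      lock.synchronize { running += 1; peak = [peak, running].max }
      sleep 0.01
      lock.synchronize { running -= 1 }
    end
  end
end
threads.each(&:join)
```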

You can look at some stuff I put here for threading purposes:
http://www.rubygarden.org/ruby?MultiThreading

Nice introduction. I found ‘ObjectPoolingAndThreading’ before, but the code
given assumed that it was spawning fresh threads itself - which doesn’t work
where some other process is spawning them.

Thanks! I’m not satisfied with it yet, but I try to add something from time
to time.

Kind regards

robert

thread-safe-instance-pool.rb (1.15 KB)

···

On Wed, Mar 12, 2003 at 06:13:46PM +0900, Robert Klemme wrote: