[RFC] The Early Demise of Myriad (Thanks To Ruby Threads)

Hi Everyone,

I figured out this weekend that Ruby's Thread implementation causes the Ruby/Event binding I wrote to completely stall and go dead. After reviewing the Ruby source and watching several strace runs, it's clear that the Ruby Thread implementation uses select in a way that--while not being bad--just isn't compatible with libevent. The second a thread is created Ruby assumes it's the only game in town and doesn't relinquish control.

This basically means that I have three choices, as I see it:

  1) Ruby/Event becomes completely evil and redefines Thread's initialize so that it throws an exception telling you to not use threads. Ugh.
  2) Implement DRb using Myriad so that people can get the advantages of threads without using them. Also ugh. No protocol specs does not make for a fun time implementing a protocol. And no, source code is *not* a protocol specification.
  3) Abandon Ruby/Event entirely and just base a new implementation of Myriad on the Ruby thread stuff. I know you purists will think this is great, but you do realize that I get *incredible* performance from libevent and that a pure Ruby implementation would only handle as many connections as select supports (which is sometimes 256)?

I'm writing to get people's feedback on the usefulness of Ruby/Event and Myriad. Take a look at the project:

http://www.zedshaw.com/projects/ruby_event/

I'm seriously leaning toward just dumping the whole thing, implementing a simple SCGI setup in pure Ruby since that's the most useful part, and then working on FastCST again.

Tell me if, given this new information, you think Myriad and/or the Ruby/Event bindings would still be useful. Also if you'd accept one of the above options.

Thanks for your time and your opinion.

Zed A. Shaw
http://www.zedshaw.com/

In article <20050911122118.7742196a.zedshaw@zedshaw.com>,
  "Zed A. Shaw" <zedshaw@zedshaw.com> writes:

a pure Ruby implementation would only handle as many connections as select supports (which is sometimes 256)?

Ruby 1.9 allocates fd_set dynamically. So the FD_SETSIZE restriction
is relaxed.

···

--
Tanaka Akira

Hello,

I'm a Ruby newb but have been programming for many years and have been attracted to Ruby from the good things I've read about it on the Net. I'm responding to the RFC because we've implemented a messaging system in C and have been investigating adding messaging to Ruby.

This could be relevant to this discussion because my understanding is that Ruby/Event's main purpose is to provide asynchronous notification to an application, which is the same purpose as the messaging system.

In the messaging system we've done, the messages are sent and processed asynchronously, sending a message is guaranteed never to block the sender, and the receiver receives one message at a time. This has the profound effect that a system can be composed of many (100's or 1000's or ...) independent objects (threads/components), all communicating and working without the need for mutual exclusion objects (mutex/critical sections). Is there any reason to believe this won't work?
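For readers who want to picture the model described above, here is a minimal sketch in plain Ruby. It is not the C messaging system mentioned in this post; the `Mailbox` class, `send_message`, `shutdown`, and `run_mailbox_demo` are hypothetical names made up for illustration, and it assumes a current Ruby where `Queue` is unbounded so `push` does not block the sender.

```ruby
require 'thread'

# Hypothetical illustration only: a mailbox whose single worker thread
# handles one message at a time, so the handler needs no mutex.
class Mailbox
  def initialize(&handler)
    @queue = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until a message arrives; :stop ends the loop.
      while (msg = @queue.pop) != :stop
        handler.call(msg)
      end
    end
  end

  # Asynchronous send: enqueue on the unbounded queue and return at once.
  def send_message(msg)
    @queue.push(msg)
    self
  end

  # Ask the worker to finish after all earlier messages are processed.
  def shutdown
    @queue.push(:stop)
    @worker.join
  end
end

# Several senders post to one mailbox; only the worker touches `total`.
def run_mailbox_demo(senders = 4, per_sender = 100)
  total = 0
  counter = Mailbox.new { |n| total += n }
  threads = Array.new(senders) do
    Thread.new { per_sender.times { counter.send_message(1) } }
  end
  threads.each(&:join)
  counter.shutdown
  total
end

puts run_mailbox_demo
```

The design point is that the state (`total`) is owned by exactly one receiver, so the senders never need a lock around it. Note that this sketch leans on Ruby threads, which is exactly what the RFC says does not mix with Ruby/Event.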

Zed, does this strike you as interesting or related?

Is this interesting to anyone else?

Regards from a newb,

Wink

"Zed A. Shaw" <zedshaw@zedshaw.com> writes:

This basically means that, I have three choices as I see it:

  1) Ruby/Event becomes completely evil and redefines Thread's
  initialize so that it throws an exception telling you to not
  use threads. Ugh.

  2) Implement DRb using Myriad so that people can get the
  advantages of threads without using them. Also ugh. No
  protocol specs does not make for a fun time implementing a
  protocol. And no, source code is *not* a protocol
  specification.

  3) Abandon Ruby/Event entirely and just base a new
  implementation of Myriad on the Ruby thread stuff. I know you
  purists will think this is great, but you do realize that I
  get *incredible* performance from libevent and that a pure
  Ruby implementation would only handle as many connections as
  select supports (which is sometimes 256)?

Or, 4) Become a great Ruby hacker and replace the Ruby thread select()
        completely with libevent. :wink:

···

Thanks for your time and your opinion.

Zed A. Shaw

--
Christian Neukirchen <chneukirchen@gmail.com> http://chneukirchen.org

I'm writing to get people's feedback on the usefulness of Ruby/Event and Myriad. Take a look at the project:

http://www.zedshaw.com/projects/ruby_event/

I'm seriously leaning toward just dumping the whole thing, implementing

  This is so depressing for me to hear, I wonder what it must be like for you :frowning:

a simple SCGI setup in pure Ruby since that's the most useful part, and [...]

Obviously this by itself would be a great contribution. But I would rather vote for your option #3. Although mitigating the usefulness of Myriad, it is a path where you do not dump everything you have done... and Ruby/Event remains useful as long as the application does not require 'thread'. :S

Is there any chance that you could libeventify the Ruby interpreter ?

Tell me if, given this new information, you think Myriad and/or the Ruby/Event bindings would still be useful. Also if you'd accept one of the above options.

Thanks for your time and your opinion.

Keep up !

···

On Sun, 11 Sep 2005 18:19:37 +0200, Zed A. Shaw <zedshaw@zedshaw.com> wrote:

Zed A. Shaw
http://www.zedshaw.com/

--
Katarina

Hi

As I mentioned, I tweaked my chat to use Myriad, and it's going a lot
better than it used to, with about 40/50 people chatting - so it can't
be all bad :). I'm hoping that as long as I don't need to use threads
then I'm golden for the moment? Or can I expect strange behaviour?

Thanks

R

Thanks, that really helps me decide if I should continue. FD_SETSIZE in a Ruby implementation that will be done "when it's done" is exactly the information I needed to know that Myriad is worth continuing.

Instead of continuing to prop up the select method, please take a look at using libevent instead. My experience of a *10-fold* increase in performance over Ruby Threads should be motivation enough. I tried my best to make the two work together, but that just didn't work.

Zed A. Shaw

···

On Mon, 12 Sep 2005 02:12:18 +0900 Tanaka Akira <akr@m17n.org> wrote:

In article <20050911122118.7742196a.zedshaw@zedshaw.com>,
  "Zed A. Shaw" <zedshaw@zedshaw.com> writes:

> a pure Ruby implementation would only handle as many connections as select supports (which is sometimes 256)?

Ruby 1.9 allocates fd_set dynamically. So the FD_SETSIZE restriction
is relaxed.
--
Tanaka Akira

This is apparently in the works for Ruby's "whenever-it's-done" version, although I haven't noticed much mention of it other than the occasional announcement that someone has been charged with doing libevent.

No point in duplicating work.

Zed A. Shaw

···

On Mon, 12 Sep 2005 03:33:32 +0900 Christian Neukirchen <chneukirchen@gmail.com> wrote:

Or, 4) Become a great Ruby hacker and replace the Ruby thread select()
        completely with libevent. :wink:

Yes!

Randy Kramer

···

On Sunday 11 September 2005 01:29 pm, Wink Saville wrote:

In the messaging system we've done, the messages are sent and processed
asynchronously, sending a message is guaranteed never to block the
sender, and the receiver receives one message at a time. This has the
profound effect that a system can be composed of many (100's or 1000's
or ...) independent objects (threads/components), all communicating
and working without the need for mutual exclusion objects
(mutex/critical sections). Is there any reason to believe this won't work?

Zed, does this strike you as interesting or related?

Is this interesting to anyone else?

Hey Wink,

There's a group of people who are clamouring for a Ruby answer to JMS. Not sure if that's what you're aiming for with this, but I'd say it's worth it if you can get it to work.

The main thing to worry about (based on my experience) is the following:

- Ruby's Garbage Collector isn't all that great. It'll thrash your C code unless you basically register your C objects with it in some way. This means for you (as it did for me) that even if you're storing the C memory in an internal data structure you still need to store the attached Ruby objects into a Hash. This double storage really impacts performance.
  * And before someone tries to answer this complaint by explaining that you can register objects with the GC, I would like to refer you to your nearest library to go read the runtime estimates for removing a random object from a linked list.
- Added to the above is the problem that the GC isn't all that great and will cause massive pauses if you don't manage the memory carefully on the C side of things.
+ Ruby's C extension API is fantastic. It'll be cake to work with compared with any of the others. The only API I thought was better was Lua's, but it's also weird.
+ Ruby's Garbage Collector works better with C code than Java's.
- Test your code with a Thread that creates lots of randomly sleeping Threads. This is how I discovered the Ruby problem.
- If you want your code to work on many platforms, then you'll be stuck writing it cross platform for windows. If you can do it in pure Ruby then you're ok.
- Ruby's performance ain't so hot. 1.9 is supposed to be fixing this when it comes out.
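
The "Thread that creates lots of randomly sleeping Threads" test from the list above could look roughly like the sketch below. This is only an assumed shape for such a test, not the script that originally exposed the problem; `spawn_sleepers` and its parameters are hypothetical names. To exercise an extension, you would run its event loop alongside this and check that it keeps making progress.

```ruby
require 'thread'

# Hypothetical stress helper: one spawner thread creates `count` threads
# that each sleep a random interval of up to `max_sleep` seconds.
# Returns the index each sleeper reports once all of them have finished.
def spawn_sleepers(count, max_sleep)
  spawner = Thread.new do
    Array.new(count) do |i|
      Thread.new(i) do |n|
        sleep(rand * max_sleep)
        n
      end
    end
  end
  # Thread#value joins the thread and returns its block's result.
  spawner.value.map(&:value)
end

finished = spawn_sleepers(50, 0.05)
puts "#{finished.size} sleeper threads finished"
```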

Otherwise I think you could work it, and there'd be a market for it possibly.

Zed A. Shaw

···

On Mon, 12 Sep 2005 02:29:42 +0900 Wink Saville <wink@saville.com> wrote:

Zed A. Shaw wrote:

Hello,

I'm a Ruby newb but have been programming for many years and have been
attracted to Ruby from the good things I've read about it on the Net.
I'm responding to the RFC because we've implemented a messaging system
in C and have been investigating adding messaging to Ruby.

This could be relevant to this discussion because my understanding is
that Ruby/Event's main purpose is to provide asynchronous notification to
an application, which is the same purpose as the messaging system.

In the messaging system we've done, the messages are sent and processed
asynchronously, sending a message is guaranteed never to block the
sender, and the receiver receives one message at a time. This has the
profound effect that a system can be composed of many (100's or 1000's
or ...) independent objects (threads/components), all communicating
and working without the need for mutual exclusion objects
(mutex/critical sections). Is there any reason to believe this won't work?

Zed, does this strike you as interesting or related?

Is this interesting to anyone else?

Regards from a newb,

Wink

"Zed A. Shaw" <zedshaw@zedshaw.com> writes:

This basically means that, I have three choices as I see it:

  1) Ruby/Event becomes completely evil and redefines Thread's
  initialize so that it throws an exception telling you to not
  use threads. Ugh.

  2) Implement DRb using Myriad so that people can get the
  advantages of threads without using them. Also ugh. No

         [...]

  3) Abandon Ruby/Event entirely and just base a new
  implementation of Myriad on the Ruby thread stuff. I know you

         [Performance implication...]

Or, 4) Become a great Ruby hacker and replace the Ruby thread select()
       completely with libevent. :wink:

   or, 5) Contribute to the threading code in some other way which
         would make life easier for people including yourself.

This is suggested because you seem to have familiarity with the
issues that multithreaded programs create, which is a subtle area, so
you could well provide something helpful. Obviously the main
obstacle is becoming familiar enough with the internals that you can
do this confidently, but it sounds like you are well on the way
there already, if you haven't actually arrived.

         Hugh

···

On Mon, 12 Sep 2005, Christian Neukirchen wrote:

Zed A. Shaw wrote:

<Great info snipped>

Thanks for the feedback and the gotchas! Let me noodle on this a while and then possibly start a discussion about my ideas.

One big worry that you've identified is the GC. In the current messaging system we manage memory from a pool malloc'd once and never freed. So would the following work: I register the entire pool as a single entity with the GC. Then I can create multiple Ruby objects referring into the pool. When Ruby is done with an object, will it try to "free" the object, or will it know that there is another reference and let me manage the memory?

Thanks in advance,

Wink

other than license issues - i wonder why not build upon a library which
already deals with these issues such as the apache portable runtime (apr)?
building on top of this, for example, would give not only portable threads, but
mmap routines, file locking, iconv, etc, etc.

   http://apr.apache.org/

regards.

-a

···

On Mon, 12 Sep 2005, Hugh Sasse wrote:

On Mon, 12 Sep 2005, Christian Neukirchen wrote:

"Zed A. Shaw" <zedshaw@zedshaw.com> writes:

This basically means that, I have three choices as I see it:

  1) Ruby/Event becomes completely evil and redefines Thread's
  initialize so that it throws an exception telling you to not use
  threads. Ugh.

  2) Implement DRb using Myriad so that people can get the advantages of
  threads without using them. Also ugh. No

       [...]

  3) Abandon Ruby/Event entirely and just base a new implementation of
  Myriad on the Ruby thread stuff. I know you

       [Performance implication...]

Or, 4) Become a great Ruby hacker and replace the Ruby thread select()
completely with libevent. :wink:

or, 5) Contribute to the threading code in some other way which would
make life easier for people including yourself.

This is suggested because you seem to have familiarity with the issue that
multithreaded programs create, which is a subtle area, so you could well
provide something helpful. Obviously the main obstacle is becoming familiar
enough with the internals that you can do this confidently, but it sounds
like you are well on the way there already, if you haven't actually arrived.

--

email :: ara [dot] t [dot] howard [at] noaa [dot] gov
phone :: 303.497.6469
Your life dwells amoung the causes of death
Like a lamp standing in a strong breeze. --Nagarjuna

===============================================================================

There’s also Sydney, which promises to add real POSIX thread support to 1.8.2
when it is done. It also makes embedding multiple Ruby interpreters easier by
moving globals into a ruby_state struct (Lua does something similar, IIRC).

http://blog.fallingsnow.net/
http://blog.fallingsnow.net/articles/category/Sydney

It would be great if this effort were to be merged with 1.9 somehow, but it’s
unclear if Matz and Evan are even discussing this. The last thing we need is
two slightly incompatible Ruby interpreters.

···


Jos Backus
jos at catnook.com

<snip>

Thanks for the feed back and the gotcha's! Let me noodle on this a while
and then possibly start a discussions about my idea's.

One big worry that you've identified is the GC. In the current messaging
system we manage memory from a pool malloc'd once and never freed. So
would the following work: I register the entire pool as a single entity
with the GC. Then can create multiple ruby objects referring into the
pool. When Ruby is done an object will it try to "free" the object or
will it know that there is another reference and let me manage the memory?

Actually, this would be the *best* way to do this. You just register the whole damn pool with the GC and never unregister it. Then you do your own memory management and control. The GC is completely cut out of the picture and can't mess with your gear. The only thing you need to do is just not declare a free function for the ruby object.

Zed A. Shaw

···

On Mon, 12 Sep 2005 09:32:58 +0900 Wink Saville <wink@saville.com> wrote:

(I've only sent this to sydney-devel since i'm not on ruby-talk
anymore, so feel free to forward this on to ruby-talk)

This is a great time to bring up the discussion of how people think I
should attempt to get the sydney modifications into core. I've got an
email I've been composing to ruby-core and haven't sent yet, and if
anyone has any ideas/thoughts on the subject, let's hear them!

One obvious thing would be to make sure the Sydney changes are available
against the ruby CVS HEAD, not just ruby_1_8. Are they?

Another issue would be how well these changes play with YARV and what, if
anything, needs to change in YARV and/or the Sydney modifications so that they
can coexist.

Btw: great job, Evan. Ruby really needs native thread support. If I understand
correctly, it sounds like your implementation is superior to Python's even (no
GIL because of Sydney's MODIFY API).

Jos

···

On Mon, Sep 12, 2005 at 09:56:25AM -0700, Evan Webb wrote:

Evan

On 9/12/05, Jos Backus <jos@catnook.com> wrote:
> There's also Sydney which promises to add real POSIX thread support to 1.8.2
> when it is done. It also makes embedding multiple Ruby interpreters easier by
> moving globals into a ruby_state struct (Lua does something similar iIrc).
>
> http://blog.fallingsnow.net/
> http://blog.fallingsnow.net/articles/category/Sydney
>
> It would be great if this effort were to be merged with 1.9 somehow but it's
> unclear if Matz and Evan are even discusssing this. The last thing we need is
> two slightly incompatible Ruby interpreters.
>
> --
> Jos Backus
> jos at catnook.com
> _______________________________________________
> Sydney-devel mailing list
> Sydney-devel@rubyforge.org
> http://rubyforge.org/mailman/listinfo/sydney-devel
>

--
When I do good, I feel good; when I do bad, I feel bad,
and that is my religion.
    -- Abraham Lincoln (1809 - 1865)


--
Jos Backus
jos at catnook.com

Zed A. Shaw wrote:

···

On Mon, 12 Sep 2005 09:32:58 +0900 Wink Saville <wink@saville.com> wrote:

<snip>

Thanks for the feed back and the gotcha's! Let me noodle on this a while and then possibly start a discussions about my idea's.

One big worry that you've identified is the GC. In the current messaging system we manage memory from a pool malloc'd once and never freed. So would the following work: I register the entire pool as a single entity with the GC. Then can create multiple ruby objects referring into the pool. When Ruby is done an object will it try to "free" the object or will it know that there is another reference and let me manage the memory?

Actually, this would be the *best* way to do this. You just register the whole damn pool with the GC and never unregister it. Then you do your own memory management and control. The GC is completely cut out of the picture and can't mess with your gear. The only thing you need to do is just not declare a free function for the ruby object.

Zed A. Shaw
http://www.zedshaw.com/

I'll give it a try and we'll see what happens.

Cheers,

Wink