[ruby-talk:444105] Ractor status: are they used?

Hi,

I just evaluated the possibility of using Ractors to distribute processing using various gems with native extensions (some matrix/vector computations with Numo::Linalg and approximate vector searches) and I have mixed feelings.

The Ractor design seems very sound to me and I would prefer to use them to distribute the load on multiple CPUs, but in practice I see 2 major obstacles:
- the first use of Ractor code isn't encouraging, even with Ruby 3.2.0, as it outputs:
"warning: Ractor is experimental, and the behavior may change in future versions of Ruby! Also there are many implementation issues."
- when heavy computations are involved you are probably already using gems with native extensions, and when trying to use them in a non-main Ractor you will almost always get an exception:
"ractor unsafe method called from not main ractor (Ractor::UnsafeError)"

This last problem could be manageable. According to
Feature #17307: A way to mark C extensions as thread-safe, Ractor-safe, or unsafe (Ruby Issue Tracking System),
native extensions should mark their functions as Ractor-safe using:

#ifdef HAVE_RB_EXT_RACTOR_SAFE
rb_ext_ractor_safe(true);
#endif
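
To make the second obstacle concrete, here is a minimal sketch of the pure-Ruby side, where everything called inside the Ractor is Ractor-safe. The method names (ractor_result, parallel_sum_of_squares) and the sum-of-squares workload are made up for illustration and do not come from any gem mentioned in this thread. Ractor#take is the Ruby 3.0 to 3.4 call; the sketch assumes newer releases expose Ractor#value instead and picks whichever is available.

```ruby
# Silence the "Ractor is experimental" warning for this sketch only.
Warning[:experimental] = false

# Assumption: Ractor#take on Ruby 3.0-3.4, Ractor#value on newer releases.
def ractor_result(ractor)
  ractor.respond_to?(:value) ? ractor.value : ractor.take
end

# Hypothetical CPU-bound work: one Ractor per range, each summing squares.
def parallel_sum_of_squares(ranges)
  ractors = ranges.map do |range|
    # The range is passed as an argument; the block cannot see outer locals.
    Ractor.new(range) { |r| r.sum { |i| i * i } }
  end
  ractors.sum { |r| ractor_result(r) }
end

total = parallel_sum_of_squares([1..1000, 1001..2000])
puts total
```

Replacing the block body with a call into a native extension that has not declared itself Ractor-safe is where the Ractor::UnsafeError quoted above would show up.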

I was considering contacting the maintainers of the various gems that we use to check with them which methods are safe and to see if I can submit a pull request, but I'm a bit hesitant and wonder whether Ractors are actually used and stable enough for maintainers to care.

Do people on this list use Ractors? What for?
Do people code gems with native extensions with them in mind? I've seen traces of rb_ext_ractor_safe when searching on GitHub, but almost all of the matches are from load.h in the Ruby headers and there are not many actual uses (in the first 4 pages of results Psych and TruffleRuby are the only exceptions, and the first external gem is ruby-extlz4 on the 5th page...).

I've yet to see how the ffi gem can handle Ractors. I already know there's no trace of Ractors in its code, and this could be a complete new can of worms (or not...).

I'll dig a bit more but in my current position I think I'll favor a multi-process solution to scale (mostly because I don't have a definitive list of gems I'll have to patch at this point and time is short). That's a shame and I'm willing to revisit this later.

Best regards,

Lionel

···

______________________________________________
ruby-talk mailing list -- ruby-talk@ml.ruby-lang.org
To unsubscribe send an email to ruby-talk-leave@ml.ruby-lang.org
ruby-talk info -- Info | ruby-talk@ml.ruby-lang.org - ml.ruby-lang.org

I’ve toyed around with Ractors, thinking they might be a Godsend for GUI
apps, like those built by Glimmer DSL for LibUI
(https://github.com/AndyObtiva/glimmer-dsl-libui). But, they always turned
out to be more pain to use with the variable access restrictions than plain
old threads, which I get for free as truly parallel in JRuby while using
Glimmer DSL for SWT (https://github.com/AndyObtiva/glimmer-dsl-swt).

I don’t know. Maybe one day I’ll see the light of Ractors and change my
mind, but being a long time multithreading user, I always thought the fears
of using true multithreading were overblown in Ruby to the point of
impracticality. In actuality, I use multithreading all the time when
building desktop apps, and I never have issues with deadlocks given I
correctly use mutexes/semaphores, or given that many apps don’t even need
to share data through the threads, but could benefit from threads for
parallel processing. It would be nice to have the more convenient option
of multithreading for those cases at least.

I’ve always thought Ruby’s stance on multithreading was extreme. Why not
support it behind a non-default switch, and let those who don’t fear using
threads and know how to benefit from them safely use real multithreading by
passing a switch, instead of being forced to use something very
restrictive and inconvenient like Ractors?

But, if it were my decision, I’d even make multithreading always enabled in
Ruby. I’ve built countless desktop GUI apps with JRuby’s multithreading
support over the years, and never had any issues. Check out this Mandelbrot
Fractal renderer that takes advantage of all your CPU cores!

I’ve observed that most people who have issues with multithreading don’t
really have a solid foundation of parallel concurrent programming from a
university degree (I have a BSc in CS), reading books, or real experience,
and thus misunderstand how to build multithreaded apps, thus run into the
problems they claim are horrifying to the point of discouraging
multithreading.

Exhibit A: the Mandelbrot Fractal problem. Some people assume that given
that you want to calculate fractal pixels in a grid, you’d have to have a
thread per pixel. Wrong!!! You are supposed to have only as many threads as
CPU hardware threads, and distribute work to them in a pool. They won’t
share data, so there won’t be a deadlock. But, people weak in parallel
concurrent programming will try to use a thread per pixel and even share a
data structure between all of them, and then will tell you that true
multithreading is too memory consuming and dangerous, and will end up
cursing threads and wanting to use fibers or ractors instead. Well, that’s
because of their inexperience in how to write parallel code correctly with
threads to begin with more than anything. As a result, everybody is
suffering an impractical restriction in Ruby because a few bad programmers
don’t know how to do multithreading when in fact it’s a simple matter of
practice makes perfect.
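
A rough sketch of the pool approach described above, for readers who want to see it spelled out. The method names (mandelbrot_iterations, render), the grid mapping and the iteration cap are invented for this illustration and are not taken from the Glimmer renderer. Workers pull row numbers from a queue and keep their results in thread-local arrays, so nothing but the queue is shared. The threads run truly in parallel on JRuby; on CRuby the same code works but runs under the global lock.

```ruby
require "etc"

# Hypothetical escape-time routine for one point of the complex plane.
def mandelbrot_iterations(cx, cy, max_iter = 50)
  x = y = 0.0
  max_iter.times do |i|
    return i if x * x + y * y > 4.0
    x, y = x * x - y * y + cx, 2 * x * y + cy
  end
  max_iter
end

# One worker thread per CPU hardware thread, never one thread per pixel.
def render(width, height, workers = Etc.nprocessors)
  rows = Queue.new
  height.times { |row| rows << row }
  rows.close # pop returns nil once the closed queue is empty

  threads = Array.new(workers) do
    Thread.new do
      done = [] # local to this thread, so no locking is needed
      while (row = rows.pop)
        cy = -1.0 + 2.0 * row / height
        pixels = Array.new(width) do |col|
          mandelbrot_iterations(-2.0 + 3.0 * col / width, cy)
        end
        done << [row, pixels]
      end
      done
    end
  end

  # Merge the per-thread results back into row order after all joins.
  threads.flat_map(&:value).sort_by(&:first).map(&:last)
end

image = render(6, 4)
puts image.map { |row| row.join(" ") }
```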

I’d rather we follow the true Ruby way, which is to empower programmers
with freedom (like the freedom of dynamic typing) and leaving them be
responsible adults. So, multithreading should have the same Ruby way
freedom too. It’s no different. And, at minimum, provide a Ruby command
switch for truly parallel multithreading to those who know what they’re
doing.

Andy Maleh

···

On Sat, Jan 21, 2023 at 9:37 AM Lionel Bouton via ruby-talk < ruby-talk@ml.ruby-lang.org> wrote:

This last problem could be manageable, according to :
Feature #17307: A way to mark C extensions as thread-safe, Ractor-safe, or unsafe - Ruby master - Ruby Issue Tracking System
native extensions should mark their individual functions as thread safe
using :

#ifdef HAVE_RB_EXT_RACTOR_SAFE
   rb_ext_ractor_safe(true);
#endif


--
Andy Maleh

LinkedIn: https://www.linkedin.com/in/andymaleh
Blog: http://andymaleh.blogspot.com
GitHub: AndyObtiva
Twitter: @AndyObtiva <https://twitter.com/AndyObtiva>

My experience with Ractor was that I wanted to run various instances of the whitequark/parser gem in parallel, but there were a couple of problems. For instance, the Ragel-generated code contained something like singleton getters/setters on global modules. This made me rethink using singleton getters/setters on global modules in the future, as they are very akin to global variables, but I didn't go as far as trying to fix Ragel to generate better code. I also didn't like the fact that I had to write unclean code like:

 call_a_method(&Ractor.make_shareable(proc do
   do_something
 end))

For a Ractor to actually launch correctly.
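
To illustrate the singleton getter/setter problem mentioned above, here is a small sketch. ParserConfig and its mode accessor are hypothetical stand-ins, not code from the parser gem or from Ragel output. The point is only that module-level instance variables are global state, and writing to them from a non-main Ractor is rejected. The sketch assumes Ractor#take on Ruby 3.0 to 3.4 and Ractor#value on newer releases.

```ruby
Warning[:experimental] = false

# Hypothetical module with a singleton getter/setter, i.e. global state.
module ParserConfig
  def self.mode
    @mode
  end

  def self.mode=(value)
    @mode = value
  end
end

ParserConfig.mode = :strict # fine in the main Ractor

worker = Ractor.new do
  begin
    ParserConfig.mode = :lenient # writes module-level state
    "no error"
  rescue Ractor::IsolationError => e
    e.class.name
  end
end

# Assumption: Ractor#take on Ruby 3.0-3.4, Ractor#value on newer releases.
outcome = worker.respond_to?(:value) ? worker.value : worker.take
puts outcome
```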

Finally, what I wanted to do, I implemented with fork.

But I agree with you in general: if you deal with threads carefully, as Erlang forces you to, nothing will go wrong. Python has been working toward removing its GIL. The GIL severely limits what you can do with threads: it basically only lets you speed up IO (and for IO, cooperative multitasking is a much better choice anyway).
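
A tiny sketch of the IO case, where threads help even under a global lock. Here sleep stands in for a blocking IO call, and the figures (4 threads, 0.2 s each) are arbitrary. Blocking waits release the lock, so the four waits overlap instead of adding up.

```ruby
# Four threads each wait 0.2 s; sleep stands in for a blocking I/O call.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
threads = Array.new(4) { Thread.new { sleep 0.2 } }
threads.each(&:join)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

# Run one after another, the same waits would take about 0.8 s.
puts format("4 waits of 0.2 s took %.2f s", elapsed)
```

Swapping the sleep for a pure-Ruby computation would not show the same gain on CRuby, which is the limitation described above.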

···



Hi,

[...]
> I’ve always thought Ruby’s stance on multithreading was extreme. why not support it behind a non-default switch, and let those who don’t fear using them and know how to benefit from them safely use real multithreading by passing a switch, instead of getting forced to use something very restrictive and inconvenient like Ractors.

If you are referring to true concurrent threading, from what I know the explanation is mostly library support. For example, PHP as an ecosystem took ages to truly support multi-threading (I'm not even sure it fully does today) because most of the libraries built with it were designed before it supported threads, so they didn't support them (chicken and egg problem).
It was far easier for Java to do it right, as true concurrent threads were baked in from the start.

> But, if it were my decision, I’d even make multithreading always enabled in Ruby. I’ve built countless desktop GUI apps with JRuby’s multithreading support over the years, and never had any issues.

Yes, if you only use pure Ruby or wrappers around Java libs. But unless I'm mistaken, interfacing JRuby with native C libraries is not straightforward and certainly not thread safe by default: I don't see a way around knowing which functions are safe and which aren't in order to properly wrap/use them.
To properly support concurrent multi-threading in MRI with native extensions, you'd have to expose only thread-safe functions or advise how to protect unsafe ones. You might have to call rb_ext_ractor_safe(true), or use something similar, to manage this and avoid relying on trial and error to see what fails.

Ractors are just another tool in the box.
- Threads are the obvious match when giving access to your whole state to many concurrent executions is needed (can be motivated by the actual problem to solve or the speed it gives for large amounts of information transfer).
- Ractors are great when you want to divide your problem into portions of code with very clear and concise interfaces. This promotes simplicity which greatly helps when you need robustness (simplicity often indirectly helps performance too).
- Whole processes are another tool that can be even more robust when used properly: many Unix daemons (like Postfix, Apache in prefork mode, PostgreSQL to name a few I use regularly) have been very solid in part because they divide their work amongst re-startable processes with clearly delimited responsibilities. This is the path we have taken until now and will probably continue with although there's an expected slight performance disadvantage with our new components.
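
For comparison, a minimal sketch of the whole-process route using fork and pipes. map_in_processes is a hypothetical helper written for this example, not something from our code base or from a gem. It assumes a platform where fork is available (so not Windows or JRuby) and results that Marshal can serialize.

```ruby
# Run the block once per chunk in a forked child and collect the results.
def map_in_processes(chunks)
  jobs = chunks.map do |chunk|
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(Marshal.dump(yield(chunk)))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close # the parent only reads from this pipe
    [pid, reader]
  end

  jobs.map do |pid, reader|
    result = Marshal.load(reader.read) # read returns at EOF, when the child closes
    reader.close
    Process.wait(pid)
    result
  end
end

sums = map_in_processes([[1, 2, 3], [4, 5, 6]]) { |chunk| chunk.sum }
p sums
```

Each child has its own copy of the interpreter state, so native extensions need no Ractor awareness; the price is serializing results across the pipe.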

Ractors seem to be victims of the same chicken and egg problem PHP threads had. They are a good solution for a whole class of problems but it seems almost nobody supports them and so they aren't as useful as they could be.

If at least the interface weren't experimental anymore, there might be more incentive to work on making gems with native extensions Ractor aware. At least this is how I see it: I'm tempted to fork some gems and submit pull requests later after testing, but this probably involves several days of work for our needs, and the experimental state puts that work at risk of going to waste. The whole-process route is less risky right now.

Best regards,

···

On 21/01/2023 at 23:23, Andy Maleh via ruby-talk wrote:

--
Lionel Bouton
Manager of JTEK SARL
https://www.linkedin.com/in/lionelbouton/


The remaining comments in this thread are useful, but I would recommend watching this talk by Samuel Williams (@ioquatix on Twitter/Mastodon) since he probably has the best knowledge to answer your question :)

From what I've read, Ractors are still slow, but as with most things in Ruby, it takes a couple of versions for a feature to become good to go.

Best wishes,
Mohit.
