Ruby and threading

Hello,

Is the current ruby 1.9.2 multithreaded at the OS level?

Regards,

Carter.

For MRI, yes, but it has a global interpreter lock. Here's a blog that
explains all the nuances.
http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/

···

On Sun, Oct 16, 2011 at 3:50 AM, Carter Cheng <cartercheng@gmail.com> wrote:

Hello,

Is the current ruby 1.9.2 multithreaded at the OS level?

Regards,

Carter.

Are there any big projects using JRuby? It seems JRuby has been available
for a long time, but it still hasn't gotten much attention.

···

--
Posted via http://www.ruby-forum.com/.

Maybe it's a stupid question, but I don't get it: what's the point of
using OS threads with a GIL?

You'll never get a situation where two threads run simultaneously anyway,
so what's the point of it?

···


It's not an unreasonable decision to consider Clojure, but the rules
it forces you to follow can also be followed voluntarily in Ruby.

I don't understand this. Ruby has no support for parallel execution
(except JRuby and similar platforms), so no matter what techniques you
apply, the program will always run on a single processor core.

As far as I understand, there are ways to work around it, like EventMachine,
but that's a half-solution: it's useful only for a limited set of problems
with little computing load and lots of IO waiting.

I honestly don't know why you would ever use Ruby threads: they give
you all the burden of concurrent programming complexity and give exactly
nothing in return (except in very specific cases like EventMachine).

Clojure, on the other hand, lets you use all available cores and makes it
simple by applying modern approaches to dealing with concurrency.

It's not Ruby vs Clojure; I just want to point out that it makes no sense to
compare these tools, because they are different.

···


Hi Josh,

Thanks for the reply. I was looking over the YARV implementation in 1.9.3rc1
and did notice some support for pthreads and Ruby threads (and some
description in the comments of certain models). Do you (or does anyone
else) know whether this code is fully implemented at present? Or is it a
global-lock-like situation?

Regards,

Carter.

···

On Sun, Oct 16, 2011 at 8:03 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Sun, Oct 16, 2011 at 3:50 AM, Carter Cheng <cartercheng@gmail.com> wrote:

> Hello,
>
> Is the current ruby 1.9.2 multithreaded at the OS level?
>
> Regards,
>
> Carter.
>

For MRI, yes, but it has a global interpreter lock. Here's a blog that
explains all the nuances.
http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/

Obtiva (who is now owned by Groupon) gave a JRuby workshop at Red Dirt Ruby
Conf. The presenters, Tyler Jennings, and Noel Rappin, said that they were
using JRuby on a large client site, which was initially written in Java, and
then their new portion was written in Rails, and they were coexisting via
JRuby.

···

On Sun, Oct 16, 2011 at 12:09 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

Are there any big projects using JRuby? It seems JRuby has been available
for a long time, but it still hasn't gotten much attention.

JRuby is being used at several companies including LinkedIn and Square. It
also powers the HBase console.

···

On Sun, Oct 16, 2011 at 10:09 AM, Alexey Petrushin <axyd80@gmail.com> wrote:

Are there any big projects using JRuby? It seems JRuby has been available
for a long time, but it still hasn't gotten much attention.

--
Tony Arcieri

I've done some consulting work, and JRuby was being used heavily.

Threads can still run simultaneously; they just can't make changes to Ruby
objects or the Ruby environment at the same time.

A native extension can release the GIL and do blocking I/O or perform a
complex computation (e.g. crypto) while Ruby code is running in another
thread.

···

On Mon, Oct 17, 2011 at 1:02 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

Maybe it's a stupid question, but I don't get it: what's the point of
using OS threads with a GIL?

You'll never get a situation where two threads run simultaneously anyway,
so what's the point of it?

--
Tony Arcieri

"MRI 1.9 uses the same technique as MRI 1.8 to improve the situation, namely
the GIL is released if a Thread is waiting on an external event (normally
IO) which improves responsiveness."

-- http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/
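
A minimal sketch of the behaviour quoted above, assuming `sleep` as a stand-in for a blocking I/O call (the 0.2-second wait and the thread count are arbitrary picks, not from the article). On MRI a waiting thread gives up the GIL, so the four waits should overlap rather than run back to back:

```ruby
# Four threads each block for WAIT seconds. sleep stands in for blocking
# I/O here; MRI releases the GIL while a thread is waiting, so the waits
# overlap instead of queueing up behind each other.
WAIT = 0.2

start = Time.now
threads = 4.times.map { Thread.new { sleep WAIT } }
threads.each { |t| t.join }
elapsed = Time.now - start

# Run one after another this would take about 4 * WAIT; with threads it
# should land close to a single WAIT.
puts "%0.3f seconds for 4 overlapping waits" % elapsed
```

This only shows overlap of waiting, not of computation: replace the `sleep` with a CPU-bound loop and MRI will run the threads one at a time.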

···

On Mon, Oct 17, 2011 at 3:02 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

Maybe it's a stupid question, but I don't get it: what's the point of
using OS threads with a GIL?

You'll never get a situation where two threads run simultaneously anyway,
so what's the point of it?

No, with a GIL no two threads can concurrently hold the lock. But a
thread which does not need the lock can execute in parallel (e.g.
while doing a syscall). For specifics you would have to ask Matz or
read the source.

Kind regards

robert

···

On Mon, Oct 17, 2011 at 10:02 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

Maybe it's a stupid question, but I don't get it: what's the point of
using OS threads with a GIL?

You'll never get a situation where two threads run simultaneously anyway,
so what's the point of it?

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

A great follow-up to this post explains why the GIL exists:
http://merbist.com/2011/10/18/data-safety-and-gil-removal/

When I ran the code Matt provides under MRI 1.9.3 (has GIL) and Rubinius,
JRuby, MacRuby (native threads, no GIL):

$ rvm 1.9.3-rc1,rbx-2.0.0pre,jruby-1.6.4,macruby-0.10 do ruby needs_gil.rb

ruby-1.9.3-rc1
0.064 seconds
400000 elements in array (should be 400000)

rbx-2.0.0pre
0.232 seconds
398877 elements in array (should be 400000)

jruby-1.6.4
0.069 seconds
398709 elements in array (should be 400000)

macruby-0.10
0.076 seconds
366231 elements in array (should be 400000)

$ cat needs_gil.rb
puts '', ENV['RUBY_VERSION']

@array, threads = [], []
start = Time.now
4.times do
  threads << Thread.new { (1..100_000).each {|n| @array << n} }
end
threads.each{|t| t.join }
stop = Time.now

puts "%0.3f seconds" % (stop - start), @array.size

Note:
* Other times I ran it under Rubinius, the array got corrupted or something
"Tuple::copy_from: index 8092 out of bounds for size 5395
(Rubinius::ObjectBoundsExceededError)"
* Other times I ran it under JRuby, it detected the corrupt data with
'ConcurrencyError: Detected invalid array contents due to unsynchronized
modifications with concurrent users'
* I ran this a whole bunch of times, sometimes MRI was fastest, sometimes
MacRuby, sometimes JRuby (MRI was fastest most consistently, though)

Thoughts:
* MRI has a GIL, thus keeping the data safe, and still performs equivalently
with other implementations (for this admittedly limited test), so do
benchmarks to decide if this will be worthwhile. It's not a fluke that Matz
wants to keep the GIL.
* I'm glad JRuby notices the corrupt data (though not always); I'm a big fan
of fail-fast.
* Has JRuby fixed their startup time issue? I ran this a lot of times and
didn't notice any of the lag I used to.

···

On Sun, Oct 16, 2011 at 4:03 AM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Sun, Oct 16, 2011 at 3:50 AM, Carter Cheng <cartercheng@gmail.com> wrote:

Hello,

Is the current ruby 1.9.2 multithreaded at the OS level?

Regards,

Carter.

For MRI, yes, but it has a global interpreter lock. Here's a blog that
explains all the nuances.
http://www.engineyard.com/blog/2011/ruby-concurrency-and-you/

The idea is that if you wrote thread-safe code, the GIL could be
removed, and then Ruby would _not_ be limited to running on one core.

I don't understand this. Ruby has no support for parallel execution
(except JRuby and similar platforms), so no matter what techniques you
apply, the program will always run on a single processor core.

So use JRuby or Rubinius if you want real multicore concurrency. Problem
solved.

Threads are still useful even if you have a GIL. Rails can use threads to
service more than one request at a time per Ruby interpreter. Without
threads each connection to a web site requires a dedicated Ruby VM to
service it. Seems bad, but people have made it work.

Clojure, on the other hand, lets you use all available cores and makes it
simple by applying modern approaches to dealing with concurrency.

Clojure has a lot of neat tools, like immutable persistent data structures,
agents, and STM. However, none of these are a panacea for concurrency bugs.
Clojure provides a very thin interop wrapper around the JVM and existing
Java libraries, which you'll find yourself using all the time when you write
Clojure programs. Since the Java libraries use mutable state and allow you
to get below the abstractions Clojure otherwise provides, you can still wind
up with thread safety bugs in your Clojure programs which are identical to
the kind you'd find in Ruby programs.

···

On Thu, Oct 27, 2011 at 9:28 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

--
Tony Arcieri

Hey Carter-

MRI will not be removing the GIL any time soon. For true concurrency,
you should use JRuby or Rubinius.

-Steve

In London there are a few big name companies following this MO.

They wrap up a legacy Java app with Ruby based acceptance tests
(cucumber/capybara is popular), and then extend the app with Ruby based
extensions (no new production code with Java).

JRuby glues up a lot of it - companies have put a lot of investment in a
Java-based operations backend. JRuby lets them keep it. It's just the JSP
grunge that they want to be rid of.

···

On Sun, Oct 16, 2011 at 7:59 PM, Josh Cheek <josh.cheek@gmail.com> wrote:

On Sun, Oct 16, 2011 at 12:09 PM, Alexey Petrushin <axyd80@gmail.com> wrote:

> Are there any big projects using JRuby? It seems JRuby has been available
> for a long time, but it still hasn't gotten much attention.
>

Obtiva (who is now owned by Groupon) gave a JRuby workshop at Red Dirt Ruby
Conf. The presenters, Tyler Jennings, and Noel Rappin, said that they were
using JRuby on a large client site, which was initially written in Java, and
then their new portion was written in Rails, and they were coexisting via
JRuby.

--
http://richardconroy.blogspot.com | http://twitter.com/RichardConroy

I think it's pretty obvious that implementations without a GIL should
behave exactly as you have shown.
You did not use synchronization when appending items to your shared
object, so sometimes items will get lost.
--
All the best, Sandor Szücs
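
A minimal sketch of what that synchronization could look like, reusing the shape of needs_gil.rb from earlier in the thread. The variable names `array` and `lock` are mine, not from the original script, and this is only one way to do it: every append goes through a `Mutex`, so no writes should get lost whether or not the implementation has a GIL.

```ruby
array   = []
lock    = Mutex.new
threads = []

4.times do
  threads << Thread.new do
    (1..100_000).each do |n|
      # Only one thread at a time may append to the shared array.
      lock.synchronize { array << n }
    end
  end
end
threads.each { |t| t.join }

puts "#{array.size} elements in array (should be 400000)"
```

The cost is lock traffic on every append, so on a parallel implementation it will run slower than the unsynchronized version.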

···

On 10/19/11 9:11 AM, Josh Cheek wrote:

ruby-1.9.3-rc1
0.064 seconds
400000 elements in array (should be 400000)

rbx-2.0.0pre
0.232 seconds
398877 elements in array (should be 400000)

jruby-1.6.4
0.069 seconds
398709 elements in array (should be 400000)

macruby-0.10
0.076 seconds
366231 elements in array (should be 400000)

$ cat needs_gil.rb
puts '', ENV['RUBY_VERSION']

@array, threads = [], []
start = Time.now
4.times do
  threads << Thread.new { (1..100_000).each {|n| @array << n} }
end
threads.each{|t| t.join }
stop = Time.now

puts "%0.3f seconds" % (stop - start), @array.size

A great follow-up to this post explains why the GIL exists:
http://merbist.com/2011/10/18/data-safety-and-gil-removal/

When I ran the code Matt provides under MRI 1.9.3 (has GIL) and Rubinius,
JRuby, MacRuby (native threads, no GIL):

Ok, I can't let this one sit.

To my eyes, the only one broken there is MRI. It's not actually doing
anything in parallel, so you get the synchronous result. Perhaps I
should file a bug against MRI that its threads...aren't?

In all seriousness, though, this is flawed reasoning. Spinning up
threads is asking the runtime to do something in parallel, and MRI is
the only example here not delivering. You are asking for the result
you get under JRuby, Rubinius, and MacRuby, since you don't
synchronize any access to the shared array, and the shared array does
not (according to Matz himself) have thread safety as part of its
contract. The only reason you get the other result under MRI is
because it isn't actually doing what you've asked of it.

Saying that the GIL is useful based on this example is a bit like
saying "JRuby not supporting C extensions is useful because they'll
never crash due to C extensions." You can't compare lack of
parallelism with parallelism when you're trying to demonstrate
parallelism.

* Other times I ran it under JRuby, it detected the corrupt data with
'ConcurrencyError: Detected invalid array contents due to unsynchronized
modifications with concurrent users'

We do our best to detect this for Array, and at some point we'll try
to do it for Hash (Hash will currently raise errors from Java like
ArrayIndexOutOfBoundsException...still rescuable, but not as nice). It
would be cool if Ruby incorporated some explicitly thread-safe
collections by default, but there are gems that provide such things
right now.

FWIW, it's almost impossible to have threadsafe data structures that
perform as well as non-threadsafe data structures, which is why we've
always opted to keep Array and Hash the way they are. Hopefully people
are starting to learn that the alternatives aren't that bad, like
using external threadsafe libs or simply mutexing around all accesses.
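
As one illustration of the "use a thread-safe structure instead" route, here is a sketch built on `Queue` from Ruby's standard library. `Queue` is not a drop-in replacement for Array (it is a FIFO with blocking `pop`), and picking it here is my assumption, not something recommended in this thread; the names `queue` and `array` are made up for the example. Writers push concurrently, and a plain Array is only touched by the main thread after they finish:

```ruby
require 'thread' # provides Queue on 1.9; harmless on newer Rubies

queue = Queue.new

threads = 4.times.map do
  Thread.new do
    # Queue#<< is safe to call from several threads at once.
    (1..100_000).each { |n| queue << n }
  end
end
threads.each { |t| t.join }

# All writers are done, so draining from a single thread needs no lock.
array = []
array << queue.pop until queue.empty?

puts "#{array.size} elements in array (should be 400000)"
```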

* I ran this a whole bunch of times, sometimes MRI was fastest, sometimes
MacRuby, sometimes JRuby (MRI was fastest most consistently, though)

For a run that short, I'm not surprised. JRuby would be faster if it
ran for more than...what...0.07 seconds? I ran a longer version
without threads (so it wouldn't error out) and JRuby was clearly the
fastest. I also wrote a version that uses a JRuby-specific module for
thread-safety, and it only slowed down by about 2x...but it completes
successfully every time:

require 'jruby/synchronized'
puts '', ENV['RUBY_VERSION']

class SafeArray < Array
  include JRuby::Synchronized
end

10.times do
  @array, threads = SafeArray.new, []
  start = Time.now
  4.times do
    threads << Thread.new { (1..100_000).each {|n| @array << n} }
  end
  threads.each{|t| t.join }
  stop = Time.now

  puts "%0.3f seconds" % (stop - start), @array.size
end

Thoughts:
* MRI has a GIL, thus keeping the data safe, and still performs equivalently
with other implementations (for this admittedly limited test), so do
benchmarks to decide if this will be worthwhile. It's not a fluke that Matz
wants to keep the GIL.

False safety (you can still easily have threads step on each other) at
the expense of parallelism. I'm not sure that's a win.
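
A small sketch of how threads can still step on each other with a GIL in place, assuming a shared counter updated as a separate read and write. The counter names and the use of `Thread.pass` to invite a thread switch between the two steps are my own choices for illustration; how many updates are lost, if any, varies by run and by implementation, so no particular number is promised for the unsafe case.

```ruby
# Even with a GIL, a read-modify-write sequence can be interrupted between
# the read and the write. Thread.pass is used here only to invite a switch.
unsafe = 0
workers = 4.times.map do
  Thread.new do
    1_000.times do
      seen = unsafe      # read
      Thread.pass        # another thread may run here
      unsafe = seen + 1  # write back a possibly stale value
    end
  end
end
workers.each { |t| t.join }

# The same steps guarded by a Mutex cannot interleave.
safe = 0
lock = Mutex.new
workers = 4.times.map do
  Thread.new do
    1_000.times do
      lock.synchronize do
        seen = safe
        Thread.pass
        safe = seen + 1
      end
    end
  end
end
workers.each { |t| t.join }

puts "unsafe counter: #{unsafe} (4000 only if no update was lost)"
puts "safe counter:   #{safe}"
```

The GIL protects the interpreter's own internals, which is why the Array append test above looks safe on MRI; it does not make multi-step application logic atomic.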

Also, I don't think Matz has ever said he really "wants" to keep the
GIL. It's just a massively difficult thing to retrofit MRI for
parallel threading without a very large rework. If they could drop the
GIL without destabilizing MRI itself, I'm sure they'd do it.

* I'm glad JRuby notices the corrupt data (though not always); I'm a big fan
of fail-fast.

It only fails fast if it actually fails, of course. Some of your runs
manage to succeed without the threads stepping on each other. And by
failure, here, I mean potentially corrupting the array. The array
contents may get out of sync because you don't synchronize writes, but
that's not a failure in a concurrent environment. Or at least, it's
not JRuby's failure...it's yours.

* Has JRuby fixed their startup time issue? I ran this a lot of times and
didn't notice any of the lag I used to.

That's good to hear! Every release includes more startup-time tweaks.
Perhaps we're finally "getting there".

- Charlie

···

On Wed, Oct 19, 2011 at 2:11 AM, Josh Cheek <josh.cheek@gmail.com> wrote:

The GIL discussion is very similar to the memory ordering property of processors [http://en.wikipedia.org/wiki/Memory_ordering] and the related problem of gaining speed in CPU design by making it more Alpha-ish vs. keeping it x86-ish but with less hassle on the software front. BTW, Ruby still mostly uses volatiles, which are not inherently thread-safe on all processors, instead of proper memory barriers.

– Matthias

···

On 28.10.2011 19:40, Steve Klabnik wrote:

The idea is that if you wrote thread-safe code, the GIL could be
removed, and then Ruby would _not_ be limited to running on one core.