JRuby vs MRI memory management

Hello

I have tried to run my string munching application in JRuby, and I am
quite pleased with the results.

My string_arrays.each{|array| m=MunchedStrings.new; array.each{|str|
m.munch str}; puts m.pretty_print } cycle tops at about 790MB ram, and
the RSS even decreases when smaller arrays are processed. In contrast,
the MRI tops at about 2.5G at the end of the cycle, and I have
observed no decrease in RSS ever. Also notable is that after parsing
the data file JRuby uses about 250MB of RAM, as opposed to the 600MB used by MRI.

These results were obtained with JRuby 1.1RC2 on OS X 10.4 with the
included Sun JVM 1.5 versus current Debian ruby MRI packages
(1.8.6).
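
A rough way to repeat this kind of measurement is to print the process RSS after each array is processed. The sketch below is only an illustration: the `MunchedStrings` class here is a hypothetical stand-in (the real one is not shown in this thread), the data is random, and `rss_kb` is an assumed helper that reads `/proc/self/status` on Linux and falls back to `ps` elsewhere (for example on OS X).

```ruby
# Hypothetical stand-in for the poster's class; the real MunchedStrings is not shown.
class MunchedStrings
  def initialize
    @counts = Hash.new(0)
  end

  # Record one string.
  def munch(str)
    @counts[str] += 1
  end

  # Return a printable summary, one "string: count" line per distinct string.
  def pretty_print
    @counts.sort.map { |s, n| "#{s}: #{n}" }.join("\n")
  end
end

# Resident set size of this process in kilobytes (assumed helper).
def rss_kb
  if File.readable?("/proc/self/status")
    File.read("/proc/self/status")[/^VmRSS:\s+(\d+)/, 1].to_i
  else
    `ps -o rss= -p #{Process.pid}`.to_i
  end
end

# Random data of growing size, since the original data cannot be released.
string_arrays = Array.new(5) do |i|
  Array.new(1000 * (i + 1)) { rand(100_000).to_s }
end

string_arrays.each do |array|
  m = MunchedStrings.new
  array.each { |str| m.munch str }
  m.pretty_print
  puts "#{array.size} strings processed, RSS now #{rss_kb} KB"
end
```

Running the same file under `ruby` and `jruby` gives two RSS traces to compare. The absolute numbers depend heavily on JVM heap settings and on the platform, so treat them as indicative only.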

I did not check for differences in the result yet, and I could not run
the jruby tests as the junit.jar is not included, at least not in the
place indicated by the README.

The JRuby 1.0 that is packaged for Debian does not work for me. It
does not understand -KU and spams lots of warnings about $; being
uninitialized.

I know that it would be nice to provide the application for testing
but I cannot release the data.
I might do a simplified version that runs on random data eventually.

Anyway, other people who have trouble with large memory usage might
try JRuby as well. It runs 1.8 code without modification (I
intentionally did not use any non-core extension to make the
application portable although it might have simplified the use of
utf-8 strings a bit).

Thanks

Michal

I've briefly tried JRuby 1.0 and 1.1RC2 with the latest Sun 1.6 JVM on 32-bit Linux when we benchmarked the Ruby Quiz #157 submissions, but I was not impressed (I didn't check the memory usage though):

- both threw a NullPointerException in the JRuby code when running one of the submissions (you can easily reproduce this by looking for the benchmark posts in the archives, where there was a URL for a page with all the relevant code); you just had to launch the benchmark to get it. There was no way of guarding against this, as it was out of Ruby's scope (i.e. a begin/rescue couldn't catch it),
- 1.0 was slower than MRI (between 2x and 3x) and 1.1RC2 slightly faster (5 to 10%), the code was mainly doing floating point computations.
- Launching the JVM is 10x slower than MRI (unusable for small scripts designed to return quickly).
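
The startup-cost point can be checked with a small timing script. This is a minimal sketch, assuming a `jruby` executable may or may not be on the PATH; `launch_time` and `average_launch_time` are hypothetical helper names, and the 10x figure above is the poster's observation, not something this script asserts.

```ruby
require 'benchmark'
require 'rbconfig'

# Wall-clock seconds to start an interpreter, run an empty script, and exit.
# Returns nil when the command cannot be run (for example, not installed).
def launch_time(command)
  ok = nil
  seconds = Benchmark.realtime { ok = system(*command, "-e", "") }
  ok ? seconds : nil
end

# Average over a few launches to smooth out disk-cache effects.
def average_launch_time(command, runs = 3)
  times = Array.new(runs) { launch_time(command) }
  return nil if times.include?(nil)
  times.inject(0.0) { |sum, t| sum + t } / runs
end

# "this Ruby" is whatever interpreter runs the script; "jruby" is assumed
# to be on the PATH and is skipped when it is not.
candidates = { "this Ruby" => [RbConfig.ruby], "jruby" => ["jruby"] }
candidates.each do |name, command|
  seconds = average_launch_time(command)
  if seconds
    puts format("%-10s %.3f s per launch", name, seconds)
  else
    puts format("%-10s not found, skipped", name)
  end
end
```

For scripts that are launched once per request, this per-launch cost is the number that matters, not the steady-state speed shown by longer benchmarks.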

JRuby is still on my watch list (ruby-gettext is now pure Ruby, so one of my main obstacles to using it in production was recently removed), but I'm not sure it is quite there yet (having the first program I try abort with a NullPointerException is not encouraging).

For the small-scripts problem with JVM loading, at some point I read that future JVMs would be able to act as resident interpreters (i.e. a JVM is always running in the background and new instances simply feed it the code instead of loading a whole new JVM). I didn't find any documentation on that in the java man page of my local Sun 1.6.0.03 JDK install, though :(

Lionel

Lionel Bouton wrote:

I've briefly tried JRuby 1.0 and 1.1RC2 with the latest Sun 1.6 JVM on 32-bit Linux when we benchmarked the Ruby Quiz #157 submissions, but I was not impressed (I didn't check the memory usage though):

- both threw a NullPointerException in the JRuby code when running one of the submissions (you can easily reproduce this by looking for the benchmark posts in the archives, where there was a URL for a page with all the relevant code); you just had to launch the benchmark to get it. There was no way of guarding against this, as it was out of Ruby's scope (i.e. a begin/rescue couldn't catch it),
- 1.0 was slower than MRI (between 2x and 3x) and 1.1RC2 slightly faster (5 to 10%), the code was mainly doing floating point computations.
- Launching the JVM is 10x slower than MRI (unusable for small scripts designed to return quickly).

There have been more perf improvements since RC2, but we should be more than 5-10% faster for normal computation. IO and such tend to drag us down a bit though, since there are necessarily more layers to go through.

JRuby is still on my watch list (ruby-gettext is now pure Ruby, so one of my main obstacles to using it in production was recently removed), but I'm not sure it is quite there yet (having the first program I try abort with a NullPointerException is not encouraging).

That's very unusual...most scripts run perfectly. If there's an NPE, it will be fixed by the end of today.

For the small-scripts problem with JVM loading, at some point I read that future JVMs would be able to act as resident interpreters (i.e. a JVM is always running in the background and new instances simply feed it the code instead of loading a whole new JVM). I didn't find any documentation on that in the java man page of my local Sun 1.6.0.03 JDK install, though :(

JRuby supports running with Nailgun, a memory-resident background JVM. You need the source release to set it up, since it builds a small client app written in C. Unpack, "ant jruby-nailgun", and then you can use jruby-ng-server and jruby-ng to run scripts. There are a few caveats listed here, but it works pretty nice for quick hits:

http://wiki.jruby.org/wiki/JRuby_with_Nailgun

- Charlie

Lionel Bouton wrote:

JRuby is still on my watch list (ruby-gettext is now pure Ruby, so one of my main obstacles to using it in production was recently removed), but I'm not sure it is quite there yet (having the first program I try abort with a NullPointerException is not encouraging).

I did not get any NPE on JRuby trunk. It's possible something was fixed since then that was causing it. If you can reproduce it, please send it to me and it will be fixed very quickly.

My numbers were not as fast as I would have expected, so some of these benchmarks are hitting slower areas in JRuby. They will also be fixed:

~/NetBeansProjects/jruby/157 $ jruby -J-server 157_benchmark.rb
              user     system      total        real
FRANK    10.661000   0.000000  10.661000 ( 10.661000)
JUSTIN    4.761000   0.000000   4.761000 (  4.761000)
LIONEL    2.025000   0.000000   2.025000 (  2.025000)
DOUG      3.065000   0.000000   3.065000 (  3.065000)
PHILIPP   1.910000   0.000000   1.910000 (  1.910000)
BILL      0.328000   0.000000   0.328000 (  0.328000)

~/NetBeansProjects/jruby/157 $ ruby 157_benchmark.rb
              user     system      total        real
FRANK    19.320000   0.120000  19.440000 ( 20.015972)
JUSTIN    5.170000   0.010000   5.180000 (  5.225455)
LIONEL    0.420000   0.000000   0.420000 (  0.417798)
DOUG      3.280000   0.030000   3.310000 (  3.571585)
PHILIPP   1.650000   0.010000   1.660000 (  1.688573)
BILL      0.240000   0.000000   0.240000 (  0.250238)

- Charlie

Lionel Bouton wrote:

- 1.0 was slower than MRI (between 2x and 3x) and 1.1RC2 slightly faster (5 to 10%), the code was mainly doing floating point computations.

I did some investigation on the benchmarks.

- I skipped FRANK because it was already easily 2x as fast as MRI

- All the remaining cases showed up as being slower or only slightly faster than MRI, which is very unusual. So I ran a sampling profile and that pointed me toward Struct being slower than it should be. Struct hasn't been updated with many recent optimizations in the rest of the system, and I'm sure it can be made a lot faster.

I filed a bug: http://jira.codehaus.org/browse/JRUBY-2220

It will be fixed for 1.1, and I'll post some new numbers when I've fixed it.
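
For readers who want to see the kind of workload that stresses Struct member access, here is a minimal micro-benchmark sketch. It is not the bench_struct file from the JRuby repository; `Point`, `PlainPoint`, and `sum_coordinates` are hypothetical names made up for illustration, and the iteration count is an arbitrary choice.

```ruby
require 'benchmark'

# A Struct and an equivalent hand-written class, to compare accessor cost.
Point = Struct.new(:x, :y)

class PlainPoint
  attr_accessor :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

ITERATIONS = 1_000_000

# Repeatedly read and write members, which is what the quiz solutions
# do in their floating-point inner loops.
def sum_coordinates(point, iterations)
  total = 0.0
  iterations.times do
    point.x += 1.0
    total += point.x + point.y
  end
  total
end

Benchmark.bm(8) do |bm|
  bm.report("Struct") { sum_coordinates(Point.new(0.0, 1.0), ITERATIONS) }
  bm.report("class")  { sum_coordinates(PlainPoint.new(0.0, 1.0), ITERATIONS) }
end
```

If the "Struct" row is much slower than the "class" row on one implementation but not on another, that points at the Struct accessors rather than at floating-point arithmetic.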

- Charlie

Technically, Java has been able to do that for ages. It has a proper VM that
you can reset. So all you need is a small application that loads a
class, executes it, resets the VM, etc. The only obstacle is that you would
have to write it ;)

Actually it's the only major defect I see in ruby MRI - it is not a
proper VM that can be reset, equipped with various GCs, and whatnot.
JRuby is a very nice bridge that allows exploiting the years of
development that went into the JVM without tying your application to it.

BTW I do not find the small script performance that bad. Implementing
something like the shell [ in jruby might have noticeable impact on
shell scripts but running a short script for testing seems fine. That
might be the OS X prelinking, though.

Thanks

Michal


On 04/03/2008, Lionel Bouton <lionel-subscription@bouton.name> wrote:

For the small-scripts problem with JVM loading, at some point I
read that future JVMs would be able to act as resident
interpreters (i.e. a JVM is always running in the background and new
instances simply feed it the code instead of loading a whole new
JVM). I didn't find any documentation on that in the java man page of
my local Sun 1.6.0.03 JDK install, though :(

<snip>

It will be fixed for 1.1, and I'll post some new numbers when I've fixed it.

Great to hear that you keep your ears and eyes on the community :)


On Tue, Mar 4, 2008 at 4:47 PM, Charles Oliver Nutter <charles.nutter@sun.com> wrote:

Charles Oliver Nutter wrote:

Lionel Bouton wrote:

- 1.0 was slower than MRI (between 2x and 3x) and 1.1RC2 slightly faster (5 to 10%), the code was mainly doing floating point computations.

I did some investigation on the benchmarks.

- I skipped FRANK because it was already easily 2x as fast as MRI

If you happen to get a similar problem: for reference, DOUG makes 1.1RC2 crash here.

I couldn't find any download for jruby trunk (nightly builds would
be even nicer than the RCs). I've not much time so I'll just assume
that this was fixed and not a corner case hit in my environment.

- All the remaining cases showed up as being slower or only slightly faster than MRI, which is very unusual. So I ran a sampling profile and that pointed me toward Struct being slower than it should be. Struct hasn't been updated with many recent optimizations in the rest of the system, and I'm sure it can be made a lot faster.

I filed a bug: http://jira.codehaus.org/browse/JRUBY-2220

Sorry I didn't have time to submit this on the spot and forgot about it later. Glad I could submit this before the final 1.1 and to see that you react so quickly.

By the way, I looked at Nailgun and it would indeed be a good fit for my requirements. Thanks for the pointer.

I've seen a recent video of Matz presenting Ruby 1.9 to the Google people (http://www.youtube.com/watch?v=oEkJvvGEtB4). He spoke of alternative Ruby VMs and mentioned that JRuby worked great and now had even better performance than MRI in the general case.
I think he paid one of the best compliments one could expect from him when he said "I've mixed feelings about this" :)

Lionel

Charles Oliver Nutter wrote:

- All the remaining cases showed up as being slower or only slightly faster than MRI, which is very unusual. So I ran a sampling profile and that pointed me toward Struct being slower than it should be. Struct hasn't been updated with many recent optimizations in the rest of the system, and I'm sure it can be made a lot faster.

I filed a bug: http://jira.codehaus.org/browse/JRUBY-2220

I haven't completely cleaned it up, but I removed a lot of unnecessary overhead from struct member access. I haven't done anything for the rest of struct yet.

Here are numbers comparing before and after in JRuby:

http://pastie.org/161223

With these numbers, JUSTIN and PHILIPP are looking a lot better, both convincingly faster than MRI. LIONEL still needs work, so there may be some other bottlenecks involved.

I'm certain they should all run faster in JRuby, at any rate. They will.

Incidentally, if anyone wants to add to or improve my little Struct benchmark, I'd appreciate it. There are precious few standard benchmarks available in the Ruby world. It's bench_struct here:

http://svn.codehaus.org/jruby/trunk/jruby/test/bench/

And feel free to improve or add to any of the others as well.

- Charlie

Michal Suchanek wrote:

BTW I do not find the small script performance that bad. Implementing
something like the shell [ in jruby might have noticeable impact on
shell scripts but running a short script for testing seems fine. That
might be the OS X prelinking, though.

For example, I use a Ruby script as an output filter for incoming mail (i.e. it takes a mail arriving at an address, uses ActionMailer to parse it, and then processes it). There can be one such mail per second during peak hours. With MRI it works fast enough; with JRuby I'll have to use a daemon, either by transforming the script into an SMTP server or by using Nailgun.
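
The daemon idea can be sketched as a resident process that accepts each mail over a local socket, so the interpreter start-up cost is paid once instead of once per message. This is only an illustration under stated assumptions: `process_mail` and `serve_one` are hypothetical names, the real script parses with ActionMailer rather than a regex, and a production version would loop on `serve_one` forever instead of handling a single demo message.

```ruby
require 'socket'

# Hypothetical per-mail work; stands in for the ActionMailer-based processing.
def process_mail(raw)
  subject = raw[/^Subject:\s*(.*)$/i, 1]
  "processed: #{subject ? subject.strip : '(no subject)'}"
end

# Accept one connection, read the whole message (until the client closes
# its write side), and send back the result.
def serve_one(server)
  client = server.accept
  begin
    client.puts process_mail(client.read)
  ensure
    client.close
  end
end

# Demo: resident server on a local port, one message sent by a thin client.
server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]
worker = Thread.new { serve_one(server) }

client = TCPSocket.new("127.0.0.1", port)
client.write "Subject: hello\r\n\r\nbody\r\n"
client.close_write
reply = client.read
client.close
worker.join
server.close

puts reply
```

The thin client still has to be started for every mail, so it should be something cheap to launch; that is the role the small C client plays in Nailgun.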

Lionel

Lionel Bouton wrote:

Charles Oliver Nutter wrote:

Lionel Bouton wrote:

- 1.0 was slower than MRI (between 2x and 3x) and 1.1RC2 slightly faster (5 to 10%), the code was mainly doing floating point computations.

I did some investigation on the benchmarks.

- I skipped FRANK because it was already easily 2x as fast as MRI

If you happen to get a similar problem: for reference, DOUG makes 1.1RC2 crash here.

I'll give it a try, thanks for the tip.

I couldn't find any download for jruby trunk (nightly builds would
be even nicer than the RCs). I've not much time so I'll just assume
that this was fixed and not a corner case hit in my environment.

Unfortunately our nightly build system is a little on-again, off-again. The easiest way would be to just check it out and build it. Provided you have Java and Ant, it's trivial:

svn co http://svn.codehaus.org/jruby/trunk/jruby
cd jruby
ant
export PATH=`pwd`/bin:$PATH # optional, you can just bin/jruby too

That's really all there is to it. And if you're benchmarking, I strongly recommend Sun's Java 6 and passing the -J-server flag to JRuby for best results.

Sorry I didn't have time to submit this on the spot and forgot about it later. Glad I could submit this before the final 1.1 and to see that you react so quickly.

We're really pushing performance in 1.1, so this is timed well. It would have been nice to have it filed as a bug, but at least I saw it here.

By the way, I looked at Nailgun and it would indeed be a good fit for my requirements. Thanks for the pointer.

Yeah, I'd love for someone to help the Nailgun guy update it and fix a few bugs I filed. Specifically, having signals propagate across would make it a lot more usable, since C-c on the client would do an equivalent action on the server. But signals are a bloody hard thing to map to multiple VMs either way.

I've seen a recent video of Matz presenting Ruby 1.9 to the Google people (http://www.youtube.com/watch?v=oEkJvvGEtB4). He spoke of alternative Ruby VMs and mentioned that JRuby worked great and now had even better performance than MRI in the general case.
I think he paid one of the best compliments one could expect from him when he said "I've mixed feelings about this" :)

Yes, I thought that was pretty funny too :) We had dinner with Matz the night before, so I think he's taking it all in stride. And we still have some work to do before we can say we're faster than 1.9 in all cases (though we're already faster in several).

- Charlie

Charles Oliver Nutter wrote:

Lionel Bouton wrote:

If you happen to get a similar problem: for reference, DOUG makes 1.1RC2 crash here.

I'll give it a try, thanks for the tip.

Confirmed and fixed already. It was a bit of code allowing Java nulls to get into an array, which should never happen. So it works fine on trunk and will work fine in 1.1 final.

- Charlie