Wow: jruby vs. MRE performance

Hi Folks,

I've been working on an application for a while now using MRE, and for a
variety of reasons, I need to get things working with JRuby. I was
pretty astounded by a couple of things:

First, JRuby takes up a lot more memory than MRE, which I suppose is the
nature of the beast. It can't be a 100% apples-to-apples comparison,
because to do what I'm doing I'm using different UI toolkits in each
environment. However, for comparison (top output):

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6042 ast 20 0 423m 105m 22m S 0 2.7 1:42.55 ruby
6058 ast 20 0 816m 329m 11m S 110 8.3 0:56.71 java

However, I have to say that I was blown away by the performance
*increase* of a particular operation in my application. Here's the
comparison of the timings:

MRE: 15.54s
JRuby: 4.56s

The nature of the application is that it does a bunch of filesystem I/O
operations, lots of regular expression checks, and lots and lots and
lots of Hash lookups. This particular timing figure also has nothing to
do with the UI environment. It's generated internally by code triggered
by the UI.

Would I be right in assuming that this performance difference is due to
faster underlying performance of the relevant Java classes, or is it
down to the execution speed of the VMs (or maybe a bit of both)?

The system environment details are:

Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
MemTotal: 4046672 kB

ruby 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux]
jruby 1.3.0 (ruby 1.8.6p287) (2009-04-21 r9535) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_11) [amd64-java]

I wasn't expecting this kind of performance difference, but I have to
say that I'm quite happy about it. :)

Great work to the JRuby team!

Cheers,

ast

P.S. No, I haven't tried it on YARV, and I'm not particularly interested
in benchmarks. This was just something that I noticed in the course of
doing my work that I wanted to share and see if other people had seen
this kind of behavior.


--
Andrew S. Townley <ast@atownley.org>
http://atownley.org

Andrew S. Townley wrote:

First, JRuby takes up a lot more memory than MRE, which I suppose is the
nature of the beast. It can't be a 100% apples-to-apples comparison,
because to do what I'm doing I'm using different UI toolkits in each
environment. However, for comparison (top output):

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
6042 ast 20 0 423m 105m 22m S 0 2.7 1:42.55 ruby
6058 ast 20 0 816m 329m 11m S 110 8.3 0:56.71 java

There are ways to mitigate this, somewhat, but it's also the nature of a VM with a generational garbage collector to be larger in memory.

MRE: 15.54s
JRuby: 4.56s

Very nice! We would probably improve even more on a longer run; this is a pretty short timing, and anything under 10s doesn't really show our full performance.
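To see the warm-up effect for yourself, a rough sketch like the following runs the same work several times and prints each pass separately. The `sample_workload` method is a hypothetical stand-in (regexp checks plus Hash lookups), not the application's real code, and the pass count and size are arbitrary. Under JRuby the later passes would be expected to come out faster than the first; no particular numbers are claimed here.

```ruby
require 'benchmark'

# Hypothetical stand-in for the real workload: build a Hash, then do
# one regexp check and one Hash lookup per key.
def sample_workload(size)
  table = {}
  size.times { |i| table["key#{i}"] = i }
  hits = 0
  size.times do |i|
    key = "key#{i}"
    hits += 1 if key =~ /\Akey\d+\z/ && table[key] == i
  end
  hits
end

# Time each pass on its own so that any warm-up shows up as later
# passes taking less time than the first one.
def time_passes(passes, size)
  (1..passes).map do
    Benchmark.realtime { sample_workload(size) }
  end
end

timings = time_passes(5, 50_000)
timings.each_with_index do |t, i|
  puts format('pass %d: %.3fs', i + 1, t)
end
```

The same script runs unchanged on MRI, which gives a baseline where the per-pass times should stay roughly flat.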

The nature of the application is that it does a bunch of filesystem I/O
operations, lots of regular expression checks, and lots and lots and
lots of Hash lookups. This particular timing figure also has nothing to
do with the UI environment. It's generated internally by code triggered
by the UI.

Would I be right in assuming that this performance difference is due to
faster underlying performance of the relevant Java classes, or is it
down to the execution speed of the VMs (or maybe a bit of both)?

For that kind of application most of the performance difference is probably coming from our implementations of Regexp, IO, and Hash. IO is usually a bit slower than MRI for us (needs work), but Regexp and Hash usually perform very well. And the JVM helps a lot as well; most of our core classes have been written in a way that allows the JVM to optimize them well.

You're probably seeing some boost from JRuby's execution performance, which is pretty good as well. But I'd bet most comes from the core class impls.
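One way to check where the time goes is to time the core classes separately. This is only a minimal sketch: the word list, the pattern, and the method names `regexp_pass` and `hash_pass` are made up for illustration, since the application's real data and patterns aren't shown in this thread.

```ruby
require 'benchmark'

# Count how many words match the pattern (exercises Regexp only).
def regexp_pass(words, pattern)
  words.count { |w| w =~ pattern }
end

# Count how many words are present as keys (exercises Hash lookup only).
def hash_pass(words, table)
  words.count { |w| table.key?(w) }
end

# Hypothetical inputs, sized arbitrarily.
words = (1..20_000).map { |i| "item-#{i}" }
table = {}
words.each_with_index { |w, i| table[w] = i }
pattern = /\Aitem-(\d+)\z/

Benchmark.bm(8) do |bm|
  bm.report('regexp:') { 10.times { regexp_pass(words, pattern) } }
  bm.report('hash:')   { 10.times { hash_pass(words, table) } }
end
```

Running the same file under both `ruby` and `jruby` gives a per-class comparison rather than one combined number.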

The system environment details are:

Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
MemTotal: 4046672 kB

ruby 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux]
jruby 1.3.0 (ruby 1.8.6p287) (2009-04-21 r9535) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_11) [amd64-java]

The 64-bit JVMs take up quite a bit more memory, as would a 64-bit implementation of anything (pointers at the very least become twice as wide). If you have the option, you might try running in 32-bit mode (-d32 passed to JVM, or -J-d32 passed to JRuby). But on 64-bit Linux I don't think there's a 32-bit JVM by default.

You can also try to choke the maximum memory down. By default, JRuby will allow the JVM to grow its heap up to 512MB. You can modify this up or down with the -Xmx JVM flag (or -J-Xmx JRuby flag). So the default would be -J-Xmx512M and if you can run with less, you might be able to force JRuby+JVM to use less memory at a possible cost of performance (more GC).
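To confirm what heap limit actually took effect, something like the sketch below can be dropped into a script and launched with, say, `jruby -J-Xmx256m app.rb` (both `app.rb` and the 256m value are placeholders). It reads the figures from `java.lang.Runtime`; the method name `jvm_heap_report` is made up for this example, and it simply returns nil when not running on JRuby.

```ruby
# Report JVM heap figures (in MB) when running under JRuby.
# Returns nil on other Ruby implementations.
def jvm_heap_report
  return nil unless RUBY_PLATFORM == 'java'

  require 'java'
  runtime = Java::JavaLang::Runtime.getRuntime
  mb = 1024.0 * 1024.0
  {
    :max_mb   => runtime.maxMemory / mb,   # upper bound set by -Xmx
    :total_mb => runtime.totalMemory / mb, # heap currently reserved
    :used_mb  => (runtime.totalMemory - runtime.freeMemory) / mb
  }
end

report = jvm_heap_report
if report
  report.each { |name, value| puts format('%-9s %8.1f', name, value) }
else
  puts 'Not running on JRuby; no JVM heap to report.'
end
```

If `max_mb` comes out close to the value passed with -J-Xmx, the flag was picked up; the used figure only gives a rough idea of the working set, since it includes garbage not yet collected.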

- Charlie

Andrew S. Townley wrote:
> First, JRuby takes up a lot more memory than MRE, which I suppose is the
> nature of the beast. It can't be a 100% apples-to-apples comparison,
> because to do what I'm doing I'm using different UI toolkits in each
> environment. However, for comparison (top output):
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6042 ast 20 0 423m 105m 22m S 0 2.7 1:42.55 ruby
> 6058 ast 20 0 816m 329m 11m S 110 8.3 0:56.71 java

There are ways to mitigate this, somewhat, but it's also the nature of a
VM with a generational garbage collector to be larger in memory.

> MRE: 15.54s
> JRuby: 4.56s

Very nice! We would probably improve even more on a longer run; this is
a pretty short timing, and anything under 10s doesn't really show our
full performance.

Since this is a desktop app, if I get anything with over 10s response
time, then I've got serious work to do (normally shooting for the < 3s
range)! :) Of course, this is why the MRI (sorry for the MRE typo
earlier) implementation was starting to worry me. Still, this is one of
the largest datasets that I have at the moment (which is still pretty
small).

One of the things on the list is to break this up so you can navigate it
in smaller chunks. Normally, you don't need to see all this stuff at
once anyway. The UI model is evolving as we go...

> The nature of the application is that it does a bunch of filesystem I/O
> operations, lots of regular expression checks, and lots and lots and
> lots of Hash lookups. This particular timing figure also has nothing to
> do with the UI environment. It's generated internally by code triggered
> by the UI.
>
> Would I be right in assuming that this performance difference is due to
> faster underlying performance of the relevant Java classes, or is it
> down to the execution speed of the VMs (or maybe a bit of both)?

For that kind of application most of the performance difference is
probably coming from our implementations of Regexp, IO, and Hash. IO is
usually a bit slower than MRI for us (needs work), but Regexp and Hash
usually perform very well. And the JVM helps a lot as well; most
of our core classes have been written in a way that allows the JVM to
optimize them well.

You're probably seeing some boost from JRuby's execution performance,
which is pretty good as well. But I'd bet most comes from the core class
impls.

Interesting info. Thanks.

> The system environment details are:
>
> Intel(R) Core(TM)2 CPU 6700 @ 2.66GHz
> MemTotal: 4046672 kB
>
> ruby 1.8.6 (2007-09-24 patchlevel 111) [x86_64-linux]
> jruby 1.3.0 (ruby 1.8.6p287) (2009-04-21 r9535) (Java HotSpot(TM) 64-Bit Server VM 1.6.0_11) [amd64-java]

The 64-bit JVMs take up quite a bit more memory, as would a 64-bit
implementation of anything (pointers at the very least become twice as
wide). If you have the option, you might try running in 32-bit mode
(-d32 passed to JVM, or -J-d32 passed to JRuby). But on 64-bit Linux I
don't think there's a 32-bit JVM by default.

I installed this one by hand (along with a few others for testing). I'm
not really too worried about the 32-bit JVM at the moment as I'm mostly
concerned with just making things work.

You can also try to choke the maximum memory down. By default, JRuby
will allow the JVM to grow its heap up to 512MB. You can modify this up
or down with the -Xmx JVM flag (or -J-Xmx JRuby flag). So the default
would be -J-Xmx512M and if you can run with less, you might be able to
force JRuby+JVM to use less memory at a possible cost of performance
(more GC).

Yeah. About that... actually the first time I tried to run anything
with JRuby (and this codebase), it said that I needed more heap size
with everything I tried (1024m, 2048m) up to 4096m. I've no idea what
the issue was, but it seems to hang/get confused when I get exceptions
from time to time. I had it set that way (4096m) the first time I ran
it, but then I tried it again with the defaults to see if it would work.
Fortunately, it did.

So far, it doesn't seem as forgiving or as informational as MRI when
handling exceptions, though. At this stage, I'm learning to translate,
however. :)

Once I get more of the UI working (at the moment, I only have one of
about 10 views "ported" to JFC/Swing from Ruby/GNOME2) and start having
something that I can actually use, I might start messing with
performance tweaks. It eventually needs to run (and run well) on
machines like my laptop:

Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz (normally ~800MHz)

It also has half the processor cache size and half the total memory, so
it will be more important. It's also got to run comfortably on Windows
(XP & Vista) without requiring OTT specs, so this might also trigger
some performance improvements.

Still, I'm chuffed, because I was actually expecting it to be slower.

Cheers,

ast


On Sat, 2009-04-25 at 00:21 +0900, Charles Oliver Nutter wrote:
--
Andrew S. Townley <ast@atownley.org>
http://atownley.org

Andrew S. Townley wrote:

Yeah. About that... actually the first time I tried to run anything
with JRuby (and this codebase), it said that I needed more heap size
with everything I tried (1024m, 2048m) up to 4096m. I've no idea what
the issue was, but it seems to hang/get confused when I get exceptions
from time to time. I had it set that way (4096m) the first time I ran
it, but then I tried it again with the defaults to see if it would work.
Fortunately, it did.

Odd... if you see it eating up all available memory again, find us on #jruby on FreeNode IRC.

So far, it doesn't seem as forgiving or as informational as MRI when
handling exceptions, though. At this stage, I'm learning to translate,
however. :)

Yeah, when it comes to stack overflows and out-of-memory we have limited options on the JVM; both are pretty fatal for the thread that encounters them, so the best we can do is say "oops, we used too much" and provide information on flags. But it seems like something else may have been broken if it was eating up over 4GB.

Once I get more of the UI working (at the moment, I only have one of
about 10 views "ported" to JFC/Swing from Ruby/GNOME2) and start having
something that I can actually use, I might start messing with
performance tweaks. It eventually needs to run (and run well) on
machines like my laptop:

Intel(R) Core(TM)2 Duo CPU U7700 @ 1.33GHz (normally ~800MHz)

We'll definitely want to look at ways to reduce the in-memory working set. If you can't shrink things down enough with your upcoming changes, come back and we'll see about doing some heap profiling and investigate whether something in JRuby is taking more memory than it ought to.

Still, I'm chuffed, because I was actually expecting it to be slower.

Well, I'm glad you gave it a try :) Keep in touch and maybe we can find specific ways to improve it even more.

- Charlie