Huge performance gap

There's a significant difference between GCC and the JVM, for example:
VMs can collect performance data while the application is running,
whereas GCC has to optimize at compile time. This yields advantages
for the VM approach because it can better target optimizations.
Depending on the application, the performance of a Java app doesn't
differ significantly from that of a C app, but the programming model is
more convenient and more robust, and thus more efficient.

Kind regards

robert

···

2006/7/1, M. Edward (Ed) Borasky <znmeb@cesmail.net>:

4. The Ruby community needs to get Ruby's performance up where PHP 4 is
on benchmarks like this. It would be wonderful if it was better than
Perl and PHP, but a bare minimum is to be competitive with PHP 4.

On 4, I'm not sure a "virtual machine" is the answer, by the way.
"Virtual machines", or as I prefer to call them, "abstract machines",
were primarily intended for portability, not performance. C happens to
be a great abstract machine, and GCC happens to be a great way to
achieve portability and performance.

--
Have a look: Robert K. | Flickr

No except. It's a useless test. I wouldn't trust a single thing about
it. What this is *essentially* measuring, especially with CGI-style
output, is startup time. Ruby does have a slower start-up time than
other options.

This is *particularly* true of the nonsensical test of running Rails
in a CGI mode.
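
To make the startup-time point concrete, here is a minimal Ruby sketch (not part of the benchmark under discussion) that compares the cost of launching a fresh interpreter, which is what a CGI-style setup pays on every request, with the cost of producing a trivial response inside an already-running process. The method names `startup_seconds` and `render_seconds` are made up for illustration, and the absolute numbers will vary by machine.

```ruby
require "benchmark"
require "rbconfig"

# Path of the interpreter running this script, so the child process
# uses the same Ruby build.
RUBY = RbConfig.ruby

# Average wall-clock seconds to start a fresh interpreter that does nothing.
# This approximates the fixed cost a CGI-style deployment pays per request.
def startup_seconds(runs = 5)
  total = Benchmark.realtime do
    runs.times { system(RUBY, "-e", "") }
  end
  total / runs
end

# Average wall-clock seconds to build a tiny "Hello, World" response in an
# already-running process (roughly what a persistent process pays per request).
def render_seconds(runs = 1000)
  total = Benchmark.realtime do
    runs.times { "Content-Type: text/plain\r\n\r\n" + "Hello, World" }
  end
  total / runs
end

if __FILE__ == $PROGRAM_NAME
  puts format("startup per process: %.6f s", startup_seconds)
  puts format("render per request:  %.6f s", render_seconds)
end
```

If the first number dwarfs the second, a CGI-style benchmark is mostly timing interpreter launch rather than the framework or the language.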

When real-world tests are done, Ruby *is* slower for now. But it's not
as slow as this would suggest. I never ended up doing an apache bench
on it, but at least subjectively, Ruwiki rendered as fast or faster
than most PHP wikis.

That's why I said that a guestbook would be a far better test and more
reliable. It's simple enough to implement in C, yet complex enough
that you're going to get more interesting results. You'd also get a
decent measure of code size differences.

-austin

···

On 7/2/06, Reggie Mr <buppcpp@yahoo.com> wrote:

Austin Ziegler wrote:
> On 7/1/06, Reggie Mr <buppcpp@yahoo.com> wrote:
>> Here is a simple graph of performance by different platforms.
>> http://www.usenetbinaries.com/doc/Web_Platform_Benchmarks.html
> I can't think of a more useless "test" other than anything put out by
> the Alioth shootout.
I would agree...except Ruby did VERY poorly in this "useless" test.

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

Except...the test results have to be taken with a very large grain of salt.

Just a moment ago I did a couple quick tests.

A "Hello, World" in PHP4:

<?php echo "Hello, World" ?>

and a "Hello, World" delivered via IOWA.

This is running on a machine which is not unloaded. It's also nowhere near the power of the machine the above test was done on, being a simple AMD Athlon box with slower RAM running a Linux 2.4 kernel and a 2.0.58 Apache, and it is not running the fastest configuration for IOWA (which is through FastCGI), but, rather, is running through mod_ruby.

Okay, enough with the disclaimers.

Multiple runs of ab -n 1000 produced a mean of about 700 requests per second from the PHP page, and about 200 from the IOWA page.

The ratio there is much better than on the Web_Platform_Benchmarks.html, and if I were to set up a test using fastcgi, it would improve further.

However, this is a weak comparison, because like is not really being compared with like.

So, to get something a little more similar, I dropped a "Hello World" into an existing CakePHP (1.1.3.2967) site that I have, and likewise dropped a "Hello World" into an IOWA (0.99.2.6) site with a comparable page layout and final page size.

CakePHP, if you are unfamiliar, is an MVC framework for PHP that is stylistically similar to Rails. So we're at least comparing frameworks to frameworks here, in the respective languages.

IOWA beats CakePHP handily, and I would expect RoR and Nitro to, as well, given what I know about their performance.

On an ab -n 1000 -c 1

I average 18 requests per second with CakePHP and 60 per second with IOWA. Again, with similarly sized pages, though the navigation in the IOWA example is generated dynamically, while it is static in the CakePHP example.

Playing with different levels of concurrency, I managed to get 35/second out of the CakePHP app and 80/second out of the IOWA one, which is still a ratio that falls dramatically in Ruby's favor when comparing actual frameworks.
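
For anyone who wants the ratios spelled out, the figures quoted above work out as follows. This is plain arithmetic on the requests-per-second numbers in this post; the `RESULTS` layout and the `ruby_to_php_ratio` helper are just illustrative names.

```ruby
# Requests per second as reported in this post (ab runs on one machine).
RESULTS = [
  { label: "plain PHP4 vs IOWA (ab -n 1000)",    php: 700.0, ruby: 200.0 },
  { label: "CakePHP vs IOWA (ab -n 1000 -c 1)",  php: 18.0,  ruby: 60.0 },
  { label: "CakePHP vs IOWA (best concurrency)", php: 35.0,  ruby: 80.0 },
]

# How many requests the Ruby side served for each request the PHP side served.
def ruby_to_php_ratio(row)
  row[:ruby] / row[:php]
end

RESULTS.each do |row|
  puts format("%-36s ruby/php = %.2f", row[:label], ruby_to_php_ratio(row))
end
```

That gives roughly 0.29 for the bare "Hello, World" pages, and about 3.33 and 2.29 in IOWA's favor for the two framework-to-framework runs.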

Performance comparison can be an entertaining exercise, but with something with as many variables as web page delivery, all performance comparisons need to be interpreted with a bit of skepticism, including those I present above. Still, Ruby doesn't strike me as surprisingly slow in any comparisons that I have ever done.

Kirk Haines

···

On Sun, 2 Jul 2006, Reggie Mr wrote:

http://www.usenetbinaries.com/doc/Web_Platform_Benchmarks.html

I can't think of a more useless "test" other than anything put out by
the Alioth shootout.

I would agree...except Ruby did VERY poorly in this "useless" test.

Reggie Mr wrote:

Austin Ziegler wrote:

Here is a simple graph of performance by different platforms.

http://www.usenetbinaries.com/doc/Web_Platform_Benchmarks.html

I can't think of a more useless "test" other than anything put out by
the Alioth shootout.

Sure it's a simple test but that doesn't make it useless. Systemic performance testing is the most relevant, but "unit" performance testing also has its use.

I would agree...except Ruby did VERY poorly in this "useless" test.

Ruby didn't do poorly. With fastcgi, it compares with PHP5. I think that's quite respectable. What did poorly was RoR. ruby+fastcgi has good performance, but it drops by over an order of magnitude if you add rails to the mix. Now *that* is quite telling.

Daniel

···

On 7/1/06, Reggie Mr <buppcpp@yahoo.com> wrote:

E. Saynatkari wrote:

You could try to run your script on YARV[1] to see if it helps.

[1] http://atdot.net/yarv

E

And turn the magic opcodes on :smiley:

http://eigenclass.org/hiki.rb?yarv+ueber+algorithmical+optimization

lopex

Alexis Reigel wrote:

Stephen Waits wrote:
> E. Saynatkari wrote:
>
>>
>> Post the code somewhere, there might be room for improvement
>> in the algorithm though it will still be considerably slower.
>
> It looks, to me, like he attached his code to the OP.
>
> Regardless, it doesn't matter. Algorithmic improvements may help both
> the C++ and Ruby versions - but it's not going to change the fact that
> one is a relatively low-level language, compiled to native machine code,
> and the other is an interpreted dynamic language. To compare them is
> either ridiculous, or more likely in this case, simply ignorant.
>
> --Steve
>
Why should that be ridiculous or ignorant?
I stated that I was aware of the differences between interpreted and
compiled languages. But that does not change the fact that I believe
that this does not explain the performance gap. An execution time of
27.65 seconds against 0.33 seconds is not just nothing, is it? It's a
factor of over 80 times. Besides, I implemented the same code in Java
too, which isn't native code

More ignorance. Java has a JIT compiler which produces
machine code.

Alexis,

This is usually a friendly community, by the way. :rolleyes:

But yes, it's harder to make a language like Ruby, which is highly
dynamic at runtime, fast like C++ and Java, which are primarily
statically compiled. The Smalltalk folks have reportedly done pretty
well though, so there exists the possibility that Ruby may get
substantially faster in the future. YARV is already making some
headway.

Regards,

Bill

···

From: "William James" <w_a_x_man@yahoo.com>

More ignorance. Java has a JIT compiler which produces
machine code.

Why are you being so mean? I wasn't aware of that.
I was just wondering why my results were so significantly different. All
I was asking was if someone had some explanations and comments on why
that is. I got some nice and reasonable answers, but I got
some unkind answers too... I didn't ask for bitter and unconstructive
comments. Did I somehow offend your honor? I did not say that Ruby is
crap compared to C++ or Java. I find Ruby an absolutely fantastic
language. I was just surprised by my results.

Hi,

I found this article today :slight_smile:

E. Saynatkari wrote:

Well, Ruby is strictly interpreted using the parse tree instead
of VM opcodes which may or may not (depending on who you ask)
make a difference. Ruby is pretty slow but usually Fast Enough(tm).

You could try to run your script on YARV[1] to see if it helps.

On my Linux (on VMware) machine:

  Ruby: time elapsed: 27.811268 sec.
  YARV: time elapsed: 2.892428 sec.

(YARV with some special optimization option)

Regards,

···

--
// SASADA Koichi at atdot dot net

I really *should* learn how to program in C. :slight_smile: I can just barely read
C, actually.

Then again, at least a generation of C programmers have implemented GCC,
Perl, R, Linux, Ruby, etc., so I haven't felt the need. And some of the
Lisp and Scheme environments seem to be just as efficient abstract
machines as C/GCC. And then there's Forth -- another efficient abstract
machine. Choices, choices, too many choices. :slight_smile:

···

ara.t.howard@noaa.gov wrote:

On Sun, 2 Jul 2006, M. Edward (Ed) Borasky wrote:

On 4, I'm not sure a "virtual machine" is the answer, by the way.
"Virtual machines", or as I prefer to call them, "abstract machines",
were primarily intended for portability, not performance. C happens to
be a great abstract machine, and GCC happens to be a great way to
achieve portability and performance.

amen!

-a

--
M. Edward (Ed) Borasky

http://linuxcapacityplanning.com

how is this performance data available significantly different from that made
transparent by gcc/gprof/gdb/dmalloc/etc - gcc can encode plenty of
information for tools like these to dump reams of info at runtime. or are you
referring to a vm's ability to actually adapt the runtime code? if so then it
seems like even compiled languages can accomplish this if the language is a
first class data type and the code segment can be manipulated as data

   http://64.233.167.104/search?q=cache:mNkjHYGIbE4J:tratt.net/laurie/research/publications/papers/tratt__compile-time_meta-programming_in_a_dynamically_typed_oo_language.pdf+lisp+compiled+metaprogramming&hl=en&gl=us&ct=clnk&cd=1

is an interesting read. if one accepts that compile time metaprogramming is
useful then it's a small leap to execute compile time metaprogramming to
enhance performance based on runtime characteristics. but maybe i'm way off
base here...

kind regards.

-a

···

On Sun, 2 Jul 2006, Robert Klemme wrote:

There's a significant difference between GCC and the JVM for example: VM's
can collect performance data while the application is running whereas GCC
has to optimize at compile time. This yields advantages for the VM approach
because it can better target optimizations. Depending on application
performance of a Java app doesn't differ significantly from a C app but the
programming model is more convenient, more robust and thus more efficient.

Kind regards

robert

--
suffering increases your inner strength. also, the wishing for suffering
makes the suffering disappear.
- h.h. the 14th dali lama

Robert Klemme wrote:

4. The Ruby community needs to get Ruby's performance up where PHP 4 is
on benchmarks like this. It would be wonderful if it was better than
Perl and PHP, but a bare minimum is to be competitive with PHP 4.

On 4, I'm not sure a "virtual machine" is the answer, by the way.
"Virtual machines", or as I prefer to call them, "abstract machines",
were primarily intended for portability, not performance. C happens to
be a great abstract machine, and GCC happens to be a great way to
achieve portability and performance.

There's a significant difference between GCC and the JVM for example:
VM's can collect performance data while the application is running
whereas GCC has to optimize at compile time. This yields advantages
for the VM approach because it can better target optimizations.

Ah, but at least for the multiprogramming case, so can (and *does*) the
operating system! And the *interpreter* can "collect performance data
while the application is running" and optimize just as easily -- maybe
even more easily -- than some underlying abstract machine.

In any event:

1. The hardware is optimized to statistical properties of the workloads
it is expected to run.

2. The operating system is optimized to statistical properties of the
workloads it is expected to run and the hardware it is expected to run on.

3. Compilers are optimized to statistical properties of the programs
they are expected to compile and the hardware the compiled programs are
expected to run on.

As a result, I don't see the need for another layer of abstraction. It's
something else that needs to be optimized!

···

2006/7/1, M. Edward (Ed) Borasky <znmeb@cesmail.net>:

--
M. Edward (Ed) Borasky

http://linuxcapacityplanning.com

>>>>>>>>>>>>>
Ruby didn't do poorly. With fastcgi, it compares with PHP5. I think that's
quite respectable. What did poorly was RoR. ruby+fastcgi has good performance,
but it drops by over an order of magnitude if you add rails to the mix. Now
*that* is quite telling.

Daniel
<<<<<<<<

We recently did a simple hello world test with Rails on a very low-end
machine and compared it with a Ruby framework that we built for our
commercial apps. Both apps had no database, and simply served the phrase
"Hello, world" with a text/plain mime type. The test client was running
localhost to minimize TCP and network effects. Rails was running in fast-cgi
mode (one process for the whole run) and our framework was running in CGI
mode (one fork per request).

Rails did 20 pages per second. The other app did 200 per second.
(Straight-run apache with a cached static page of similar size could
probably do 1000/second or more on this machine.)

Bear in mind, both of these frameworks are *Ruby*. This tells me the
comparison to other languages is misleading at best.

Austin makes a good point; I'd expect they'd all blow away Java in CGI mode
:slight_smile:

···

On 7/2/06, Austin Ziegler <halostatue@gmail.com> wrote:

On 7/2/06, Reggie Mr <buppcpp@yahoo.com> wrote:
> I would agree...except Ruby did VERY poorly in this "useless" test.

No except. It's a useless test. I wouldn't trust a single thing about
it. What this is *essentially* measuring, especially with CGI-style
output, is startup time. Ruby does have a slower start-up time than
other options.

--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com

Alexis Reigel wrote:

More ignorance. Java has a JIT compiler which produces
machine code.

Why are you being so mean? I wasn't aware of that.

Calm down, Alexis... nobody is being mean. When someone doesn't understand something because they aren't educated about it, they are considered ignorant. There's nothing wrong with that.

Enough people have explained the reasons for the performance gap by this point that it should be clear. If you still need more help with this issue, please let us know.

Meanwhile, I'm certain that none of us intended to be mean, or otherwise attack you in any way.

--Steve

Yep. YARV does do better on this test:

# YARV 0.4.0
e:\yarv\bin\ruby sudoku-solver.rb
time elapsed: 18.953 sec.
count: 127989
3 6 2 4 9 5 7 8 1
9 7 1 6 2 8 5 3 4
8 5 4 1 3 7 9 6 2
2 9 3 5 6 4 1 7 8
5 1 7 3 8 2 4 9 6
6 4 8 9 7 1 2 5 3
7 2 9 8 1 3 6 4 5
1 8 5 7 4 6 3 2 9
4 3 6 2 5 9 8 1 7

# Regular ruby 1.8.4
ruby sudoku-solver.rb
time elapsed: 27.812 sec.
count: 127989
3 6 2 4 9 5 7 8 1
9 7 1 6 2 8 5 3 4
8 5 4 1 3 7 9 6 2
2 9 3 5 6 4 1 7 8
5 1 7 3 8 2 4 9 6
6 4 8 9 7 1 2 5 3
7 2 9 8 1 3 6 4 5
1 8 5 7 4 6 3 2 9
4 3 6 2 5 9 8 1 7

···

On 2/23/06, Bill Kelly <billk@cts.com> wrote:

From: "William James" <w_a_x_man@yahoo.com>
> Alexis Reigel wrote:
>> Stephen Waits wrote:
>> > E. Saynatkari wrote:
>> >
>> >>
>> >> Post the code somewhere, there might be room for improvement
>> >> in the algorithm though it will still be considerably slower.
>> >
>> >
>> > It looks, to me, like he attached his code to the OP.
>> >
>> > Regardless, it doesn't matter. Algorithmic improvements may help both
>> > the C++ and Ruby versions - but it's not going to change the fact that
>> > one is a relatively low-level language, compiled to native machine code,
>> > and the other is an interpreted dynamic language. To compare them is
>> > either ridiculous, or more likely in this case, simply ignorant.
>> >
>> > --Steve
>> >
>> Why should that be ridiculous or ignorant?
>> I stated that I was aware of the differences between interpreted and
>> compiled languages. But that does not change the fact that I believe
>> that this does not explain the performance gap. An execution time of
>> 27.65 seconds against 0.33 seconds is not just nothing is it? It's a
>> factor of over 80 times. Besides, I implemented the same code in java
>> too, which isn't native code
>
> More ignorance. Java has a JIT compiler which produces
> machine code.

Alexis,

This is usually a friendly community, by the way. :rolleyes:

But yes, it's harder to make a language like Ruby, which is highly
dynamic at runtime, fast like C++ and Java, which are primarily
statically compiled. The Smalltalk folks have reportedly done pretty
well though, so there exists the possibility that Ruby may get
substantially faster in the future. YARV is already making some
headway.

Alexis Reigel wrote:

>
> More ignorance. Java has a JIT compiler which produces
> machine code.
>

Why are you being so mean? I wasn't aware of that.
I was just wondering why my results were so significantly different. All
I was asking was if someone had some explanations and comments on why
that is. I got some nice and reasonable answers, but I got
some unkind answers too... I didn't ask for bitter and unconstructive
comments. Did I somehow offend your honor? I did not say that Ruby is
crap compared to C++ or Java. I find Ruby an absolutely fantastic
language. I was just surprised by my results.

If you have the curiosity, there are all kinds of results to wonder
about, for example Ruby compared to a C interpreter :wink:
http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=ruby&lang2=ch

SASADA Koichi wrote:

Hi,

I found this article today :slight_smile:

E. Saynatkari wrote:
  
On my Linux (on VMware) machine:

  Ruby: time elapsed: 27.811268 sec.
  YARV: time elapsed: 2.892428 sec.

(YARV with some special optimization option)
  

Guest machines under VMware are pretty much useless as a performance
profiling platform, for a variety of reasons.

···

--
M. Edward (Ed) Borasky

http://linuxcapacityplanning.com

Ahhh, venturing into a domain I love talking about.

Runtime-modification of code is exactly what sets the JVM apart from static
compilation and optimization in something like GCC. The link Ara posted
above is another way to look at the same kind of runtime modification. The
JVM, because it JIT compiles code rather than AOT, can change the parameters
of that compilation whenever it likes. If a particular piece of JIT-compiled
code is heavier on integer math than on memory allocation, it may re-JIT to
avoid processor-intensive aspects of object creation and garbage collection.
If a piece of code is used heavily, there may be opportunities to
dynamically inline or reorder subroutines at runtime. All the tricks C
coders might have to do by hand or decide on at compile time can be done as
needed based on runtime performance profiling.

You could go through a gcc/gdb/gprof and so on cycle, but the point of the
JVM is that you don't have to burn those hours. You write the code once, and
the JVM gobbles it up, runs it interpreted for a (short) while, and then
starts generating native machine code that fits the runtime profile it has
gathered. As that profile changes, code can be regenerated, and indeed the
longer an application runs, the faster it gets.
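
As a rough picture of "run it interpreted for a while, then switch to something faster once it is hot", here is a toy Ruby sketch. It is only an illustration of the idea and not how the JVM or YARV actually work: nothing here emits machine code, the `TieredMethod` class and its `HOT_THRESHOLD` are hypothetical, and the "compiled" tier is just a precomputed lambda.

```ruby
# Toy model of "interpret first, specialize once hot".
class TieredMethod
  HOT_THRESHOLD = 3 # hypothetical; real VMs use far larger, tuned counters

  attr_reader :calls, :tier

  # ops is a list like [[:add, 2], [:mul, 3]], applied left to right.
  def initialize(ops)
    @ops = ops
    @calls = 0
    @tier = :interpreted
    @fast_path = nil
  end

  def call(x)
    @calls += 1
    return @fast_path.call(x) if @fast_path

    # The runtime profile here is just a call counter.
    promote if @calls >= HOT_THRESHOLD
    interpret(x)
  end

  private

  # Slow tier: dispatch on every operation for every call.
  def interpret(x)
    @ops.reduce(x) do |acc, (op, arg)|
      case op
      when :add then acc + arg
      when :mul then acc * arg
      else raise ArgumentError, "unknown op #{op}"
      end
    end
  end

  # "Compile": fold the op list into one affine function a*x + b,
  # so later calls skip the per-op dispatch entirely.
  def promote
    a = 1
    b = 0
    @ops.each do |op, arg|
      case op
      when :add
        b += arg
      when :mul
        a *= arg
        b *= arg
      else
        raise ArgumentError, "unknown op #{op}"
      end
    end
    @fast_path = ->(x) { a * x + b }
    @tier = :compiled
  end
end

m = TieredMethod.new([[:add, 2], [:mul, 3]])
results = 5.times.map { |i| m.call(i) }
puts results.inspect # same answers from both tiers
puts m.tier
```

The first calls go through the slow dispatch loop; once the counter crosses the threshold, later calls use the folded version and return the same results. A real JIT gathers a much richer profile and can throw the fast version away again if its assumptions stop holding.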

It is for this reason that many algorithms running in the JVM run as fast as
C or C++ equivalents.

Ruby has great potential to make these same kinds of optimizations at
runtime, and as I understand it, YARV will do quite a bit of "smart"
optimization at runtime.

The JVM got a bad rap because in versions 1.2 and earlier, it really was
slow. Since 1.3, however, it has increased tremendously in
performance. 1.3 was many times faster than 1.2. 1.4 was twice as fast as
1.3. 1.5 and 1.6 are each another 20-25% faster again. I think the JVM and
the recent success of the .NET CLR have shown that a VM approach is a great
way to go.

···

On 7/1/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:

On Sun, 2 Jul 2006, Robert Klemme wrote:

> There's a significant difference between GCC and the JVM for example: VM's
> can collect performance data while the application is running whereas GCC
> has to optimize at compile time. This yields advantages for the VM approach
> because it can better target optimizations. Depending on application
> performance of a Java app doesn't differ significantly from a C app but the
> programming model is more convenient, more robust and thus more efficient.
>
> Kind regards
>
> robert

how is this performance data available significantly different from that made
transparent by gcc/gprof/gdb/dmalloc/etc - gcc can encode plenty of
information for tools like these to dump reams of info at runtime. or are you
referring to a vm's ability to actually adapt the runtime code? if so then it
seems like even compiled languages can accomplish this if the language is a
first class data type and the code segment can be manipulated as data

http://64.233.167.104/search?q=cache:mNkjHYGIbE4J:tratt.net/laurie/research/publications/papers/tratt__compile-time_meta-programming_in_a_dynamically_typed_oo_language.pdf+lisp+compiled+metaprogramming&hl=en&gl=us&ct=clnk&cd=1

is an interesting read. if one accepts that compile time metaprogramming is
useful then it's a small leap to execute compile time metaprogramming to
enhance performance based on runtime characteristics. but maybe i'm way off
base here...

--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com

An interpreter *is* an underlying abstract machine. Ruby and Java both have
interpreters that do basically the same thing. Java additionally has a JIT
compiler that takes code the next step toward native.

···

On 7/1/06, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:

Ah, but at least for the multiprogramming case, so can (and *does*) the
operating system! And the *interpreter* can "collect performance data
while the application is running" and optimize just as easily -- maybe
even more easily -- than some underlying abstract machine.

--
Charles Oliver Nutter @ headius.blogspot.com
JRuby Developer @ www.jruby.org
Application Architect @ www.ventera.com

Robert Klemme wrote:
> There's a significant difference between GCC and the JVM for example:
> VM's can collect performance data while the application is running
> whereas GCC has to optimize at compile time. This yields advantages
> for the VM approach because it can better target optimizations.
Ah, but at least for the multiprogramming case, so can (and *does*) the
operating system! And the *interpreter* can "collect performance data
while the application is running" and optimize just as easily -- maybe
even more easily -- than some underlying abstract machine.

As Charles pointed out the interpreter *is* a virtual machine and thus
equivalent to a JVM with regard to the runtime information it can
collect (AFAIK the current Ruby runtime does not, but it could).

In any event:

1. The hardware is optimized to statistical properties of the workloads
it is expected to run.

2. The operating system is optimized to statistical properties of the
workloads it is expected to run and the hardware it is expected to run on.

3. Compilers are optimized to statistical properties of the programs
they are expected to compile and the hardware the compiled programs are
expected to run on.

As a result, I don't see the need for another layer of abstraction. It's
something else that needs to be optimized!

I'm not sure whether you read Charles's excellent posting about the
properties of a VM. All optimizations you mention are *static*, which
is reflected in the fact that they are based on statistical
information of a large set of applications, i.e. there is basically
just one application that those optimizations can target. A VM on the
other hand (and this is especially true for the JVM) has more precise
information about the current application's behavior and thus can
target optimizations better.

I'll try an example: consider method inlining. With C++ you can have
methods inlined at compile time. This will lead to code bloat and the
developer will have to decide which methods he wants inlined. This
takes time, because he has to do tests and profile the application.
Even then it might be that his tests do not reflect the production
behavior due to some error in the setup or wrong assumptions about the
data to be processed etc. In the worst case method inlining can have
an advers

···

2006/7/2, M. Edward (Ed) Borasky <znmeb@cesmail.net>:

--
Have a look: Robert K. | Flickr