Benchmark for Ruby

OK, let us get off our nice host thread, which is much better of course.

Austin, what you are suggesting seems very interesting to me: you claim that
we do not know anything about benchmarking.
For myself I accept this as a safe and comfortable working theory.
I am more than willing to learn, though (to know even less afterwards, but
philosophy can wait, unless Ara is with us ;).
So it is Ed, if I read correctly, who could teach us some tricks. Are you
with us, Ed?

Links?

I am looking forward to this.

Cheers
Robert

···

--
Two things are infinite: the universe and human stupidity; and as far as
the universe is concerned, I have not yet acquired absolute certainty.

- Albert Einstein

Robert Dober wrote:

[...] So it is Ed, if I read correctly, who could teach us some tricks. Are
you with us, Ed?

Yeah, I'm with you. I actually took a look at the shootout page. First
of all, it isn't as bad a site as some people make it out to be. Second,
they are running Debian and Gentoo, which means almost anyone could
duplicate their work (assuming the whole enchilada can be downloaded as
a tarball).

Analysis Phase (Trick 1):

1. Collect the whole matrix of benchmarks. The rows will be benchmark
names and the columns will be languages, and the cells in the matrix
will be benchmark run times. Pick a language to be the "standard". C is
probably the obvious choice, since it's likely to be the most "practical
low-level language" (meaning not as many folks know Forth.) :slight_smile:

2. Now you compute the *natural log* of ratios of the times for all the
languages to the standard for each of the benchmarks. In some convenient
statistics package (A spreadsheet works fine, but I'd do it in R because
the kernel density estimators, boxplots, etc. are built in), compute the
histograms (or kernel density estimators, or boxplots, or all of the
above) of the ratios for each language. That tells you how the ratios
are distributed.

Example:

         Ruby  Perl  Python  PHP  C
Bench1   tr1   tp1   ty1     th1  tc1
Bench2   tr2   tp2   ty2     th2  tc2
Bench3   tr3   tp3   ty3     th3  tc3

         Ruby         Perl         Python  PHP  C
Bench1   ln(tr1/tc1)  ln(tp1/tc1)  ...     ...  0
Bench2   ln(tr2/tc2)  ln(tp2/tc2)  ...     ...  0
Bench3   ln(tr3/tc3)  ln(tp3/tc3)  ...     ...  0

And then take the histograms of the columns (smaller is better).
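
In Ruby, step 2 might look something like this (a sketch only; the data,
the numbers, and the decision to hold the times in a nested hash are all
made up for illustration):

# Hypothetical raw data: rows are benchmarks, columns are languages, values
# are run times in seconds. 'C' is the standard.
times = {
  'Bench1' => { 'Ruby' => 12.0, 'Perl' =>  9.5, 'C' => 1.1 },
  'Bench2' => { 'Ruby' => 30.2, 'Perl' => 41.7, 'C' => 3.8 },
  'Bench3' => { 'Ruby' => 55.0, 'Perl' => 48.3, 'C' => 4.1 }
}

# Natural log of each time relative to the standard: ln(t_lang / t_C).
# The standard's own column comes out as ln(1) = 0.
log_ratios = {}
times.each do |bench, row|
  std = row['C']
  log_ratios[bench] = {}
  row.each { |lang, t| log_ratios[bench][lang] = Math.log(t / std) }
end

log_ratios.each { |bench, row| puts "#{bench}: #{row.inspect}" }

The histograms (or boxplots) themselves are easier to draw in R or a
spreadsheet, as Ed says.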

Tuning Phase (Trick 2):

Find the midpoints on the density curves, boxplots or histograms. These
are the "typical" benchmarks. They are more representative than the
"outliers". I saw one, for example, where Ruby was over 100 times as
fast as Perl. That's not worth investing any time in -- it's some kind
of fluke: either something Perl sucks at, something Ruby is wonderful at,
or a better implementation in the Ruby code than in the Perl code.
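
A sketch of that selection step, with invented numbers, could be as simple
as sorting one language's log-ratios and taking the benchmark nearest the
median:

# Hypothetical log-ratios for Ruby relative to C, one per benchmark; the
# numbers are made up, including one outlier.
ruby_log_ratios = {
  'Bench1' => 2.4, 'Bench2' => 1.1, 'Bench3' => 3.0,
  'Bench4' => 1.9, 'Bench5' => -4.6   # an outlier, not worth chasing
}

sorted  = ruby_log_ratios.values.sort
median  = sorted[sorted.size / 2]
typical = ruby_log_ratios.sort_by { |bench, r| (r - median).abs }.first
puts "Most typical benchmark: #{typical[0]} (log-ratio #{typical[1]})"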

Now you build a "profiling Ruby", run the mid-range benchmarks with
profiling, and see where Ruby is spending its time. If you happen to
have a friend on the YARV team or the Cardinal team, have them run the
benchmarks too.

Some other tricks:

Once you know where Ruby is spending its time, play with compiler flags.
gcc has oodles of possible optimizations, and gcc itself was tuned by
processes like this. It's worth spending a lot of time compiling the
Ruby interpreter, since it's going to be run often.

Those are simple "low-hanging fruit" tricks ... stuff you can do without
actually knowing what's going on inside the Ruby interpreter. It will be
painfully obvious from the profiles, I think, where the opportunities are.

M. Edward (Ed) Borasky wrote:

Robert Dober wrote:
> [...]
>
Yeah, I'm with you. I actually took a look at the shootout page. First
of all, it isn't as bad a site as some people make it out to be. Second,
they are running Debian and Gentoo, which means almost anyone could
duplicate their work (assuming the whole enchilada can be downloaded as
a tarball).

Grab the CVS tree
http://shootout.alioth.debian.org/gp4/faq.php#downsource

If you have problems building and measuring
http://shootout.alioth.debian.org/gp4/faq.php#talk

Or try something different
http://shootout.alioth.debian.org/gp4/faq.php#similar

···

[...]

Analysis Phase (Trick 1):

1. Collect the whole matrix of benchmarks. The rows will be benchmark
names and the columns will be languages, and the cells in the matrix
will be benchmark run times. Pick a language to be the "standard". C is
probably the obvious choice, since it's likely to be the most "practical
low-level language" (meaning not as many folks know Forth.) :slight_smile:

<OT>I've tried, but FORTH still hasn't clicked with me yet...</OT>
        [...]

Some other tricks:

Once you know where Ruby is spending its time, play with compiler flags.
gcc has oodles of possible optimizations, and gcc itself was tuned by
processes like this. It's worth spending a lot of time compiling the
Ruby interpreter, since it's going to be run often.

There exists at least this effort to use Genetic Algorithms for
tuning compiler options. I've not explored it yet.

http://www.coyotegulch.com/products/acovea/index.html

One may need a cluster of machines (of many platforms?) to do this
usefully, but still. Maybe Rinda can help us all contribute...

Those are simple "low-hanging fruit" tricks ... stuff you can do without
actually knowing what's going on inside the Ruby interpreter. It will be
painfully obvious from the profiles, I think, where the opportunities are.

        Hugh

···

On Fri, 15 Sep 2006, M. Edward (Ed) Borasky wrote:

<snip some very interesting benchmark methodology>

Once you know where Ruby is spending its time, play with compiler flags.
gcc has oodles of possible optimizations, and gcc itself was tuned by
processes like this. It's worth spending a lot of time compiling the
Ruby interpreter, since it's going to be run often.

I compiled Ruby on my system without --enable-pthreads and had a ~15-20%
performance increase in real-world runs of my application, which makes no use
of threads or external libraries. I emphasise that this is specific to my
system (linux kernel 2.6, nptl-only) and my application, but that's still a
non-insignificant performance increase for a real application run (rather
than a micro-benchmark).

Regards, Alex

···

On Friday 15 September 2006 03:48, M. Edward (Ed) Borasky wrote:

Playing with compiler flags is of limited utility across the board.
The compiler flags for each platform *will* *differ*, and GCC
optimizations are different from native compiler optimizations (and
GCC isn't well-optimized off PPC and Intel; I will never compile
something with a non-native compiler if I can avoid it for any
reason).

I'm not interested in things that require compiler flag tweaks --
that's far too variant and should be done on a per-system basis. I'm
looking for a benchmark suite that shows areas where the implementation
can be improved. I'm not looking for artificially limited benchmarks
where a simple tweak of an operating system option (e.g., ulimit)
enables the benchmark to run or run faster.

There's the difference.

-austin

···

On 9/14/06, M. Edward (Ed) Borasky <znmeb@cesmail.net> wrote:

Once you know where Ruby is spending its time, play with compiler flags.
gcc has oodles of possible optimizations, and gcc itself was tuned by
processes like this. It's worth spending a lot of time compiling the
Ruby interpreter, since it's going to be run often.

--
Austin Ziegler * halostatue@gmail.com * http://www.halostatue.ca/
               * austin@halostatue.ca * You are in a maze of twisty little passages, all alike. // halo • statue
               * austin@zieglers.ca

Ed, Isaac

hopefully I have not wasted your time. I read your posts with interest, but
that is not what I wanted, or what I understood.
I *really* should have been clearer, sorry, sorry!

What I want is a benchmark site for Ruby:
Ruby vs. Ruby, Ruby only,
teaching people how to benchmark code and giving them a good idea of what is
fast and what is slow.

an example: inject vs. each
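
For instance, a minimal sketch with the standard Benchmark module (bmbm
does a rehearsal run first, which already touches on the "how to use it
correctly" question below):

require 'benchmark'

# Sum an array both ways; which one wins, and by how much, is exactly the
# kind of result such a site should collect and explain.
data = (1..100_000).to_a

Benchmark.bmbm(10) do |bm|
  bm.report('inject') { 50.times { data.inject(0) { |sum, x| sum + x } } }
  bm.report('each')   { 50.times { sum = 0; data.each { |x| sum += x }; sum } }
end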

Austin pointed out that this is more complicated than one might think, so
what I was interested in, and should have said so (no more posts after
23:00 local time!), is:

* are there problems with Ruby's benchmark module?
* how do we use or enhance it correctly?
* what OS conditions do we have to ensure for a fair comparison?
* maybe more

Cheers
Robert

···

--
Two things are infinite: the universe and human stupidity; and as far as
the universe is concerned, I have not yet acquired absolute certainty.

- Albert Einstein

I'm with Austin on this. Raw performance improvements on particular systems
are interesting and worth doing and can be achieved in a whole range of good
ways. But it would be really interesting to *all* Ruby programmers to
understand where the implementation itself falls short (or less
provocatively, where it can be improved). I recently went through a
ruby-prof exercise with Net::LDAP's search function and found a whole raft
of surprising things. In the first place, there were no "hot spots" where
the code was spending a double-digit percentage of its time. But there were
a lot of opportunities for 2% and 5% improvements, and they added up to
about a 60-70% improvement overall (meaning that a query which used to
execute in x time now takes about 0.4x).
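
For anyone who wants to repeat that kind of exercise, a minimal ruby-prof
session looks roughly like this (the profiled block is just a stand-in for
your own code, and the output details depend on the ruby-prof version):

require 'rubygems'
require 'ruby-prof'

# Profile an arbitrary block and print a flat report; the report shows the
# percentage of total time spent in each method, which is where the
# scattered 2% and 5% opportunities show up.
result = RubyProf.profile do
  10_000.times { ('a'..'z').to_a.join.upcase }   # stand-in workload
end

RubyProf::FlatPrinter.new(result).print(STDOUT)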

Some of the surprises: Symbol#=== is really slow. Replace case statements
against Symbols with if/then constructions. Accessing hash tables is really
slow (no big surprise), so in really hot loops look for an algorithmic
alternative. And there are quite a few more. Maybe they ought to be compiled
and published.

And recently I was discussing GC with Kirk Haines and decided to test my
oft-expressed feeling that Ruby performance degrades very rapidly with
working-set size. And I turned up some strong hints (perhaps not surprising)
that Ruby would be a whole lot faster with generational GC.
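
A crude way to see the working-set effect for yourself (sizes and counts are
arbitrary, and this only hints at what a generational collector would help
with):

require 'benchmark'

# Time a fixed amount of short-lived allocation while an ever larger set of
# live objects is retained; with a non-generational mark-and-sweep GC the
# fixed workload slows down as the retained heap grows, because every GC
# pass has to scan all of it.
retained = []
[0, 200_000, 400_000, 800_000].each do |extra|
  retained.concat(Array.new(extra) { |i| "retained object #{i}" })
  GC.start
  t = Benchmark.realtime do
    200_000.times { "short-lived" + " string" }
  end
  puts "live objects ~#{retained.size}: #{'%.3f' % t} s"
end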

···

On 9/15/06, Austin Ziegler <halostatue@gmail.com> wrote:

I'm not interested in things that require compiler flag tweaks --
that's far too variant and should be done on a per-system basis. I'm
looking for a benchmark suite that shows areas where the implementation
can be improved. I'm not looking for artificially limited benchmarks
where a simple tweak of an operating system option (e.g., ulimit)
enables the benchmark to run or run faster.

There's the difference.

A. S. Bradbury wrote:

[...]

I compiled Ruby on my system without --enable-pthreads and had a ~15-20%
performance increase in real-world runs of my application, which makes no use
of threads or external libraries. I emphasise that this is specific to my
system (linux kernel 2.6, nptl-only) and my application, but that's still a
non-insignificant performance increase for a real application run (rather
than a micro-benchmark).

Regards, Alex

Despite the fact that gizmos like hyperthreading and dual-core
processors are the "default" in new boxes, a lot of "us" are still
running on quite serviceable single-processor machines. For such
machines, turning off pthreads when you recompile is usually a good
thing, for Ruby and quite a few other languages and applications that
implement their own threading models.

Speaking of such, if you are set up to rebuild your Linux kernel,
single-processor machines tend to run faster if you turn off SMP when
you rebuild the kernel.

···

On Friday 15 September 2006 03:48, M. Edward (Ed) Borasky wrote:

Hugh Sasse wrote:

<OT>I've tried, but FORTH still hasn't clicked with me yet...</OT>

<not-quite-OT>
Check out the gForth and vmgen manuals at

http://www.ugcs.caltech.edu/manuals/devtool/vmgen-0.6.2/index.html

There was a project to build a Ruby virtual machine using vmgen.
</not-quite-OT>

There exists at least this effort to use Genetic Algorithms for
tuning compiler options. I've not explored it yet.

http://www.coyotegulch.com/products/acovea/index.html

One may need a cluster of machines (of many platforms?) to do this
usefully, but still. Maybe Rinda can help us all contribute...

I think I installed acovea once -- it's part of Gentoo -- but I don't
remember doing anything with it. But the concept is certainly
intriguing, and might be more so to the folks on this list who are
always talking about how machine cycles are cheaper than programmer
cycles. Of course, if the programmer has to spend his or her cycles
waiting for a genetic algorithm to converge ...

:slight_smile:

I've had bad experiences in the past with this sort of optimization.
"Real" compiler optimization is a hard problem in the complexity sense,
plus there's all the time you have to spend correctness-testing the
optimized versions. My experience has been it's far better to pluck the
low-hanging fruit, which is what gcc does by itself, and which is what
the designers of virtual machines do.

Those are simple "low-hanging fruit" tricks ... stuff you can do without
actually knowing what's going on inside the Ruby interpreter. It will be
painfully obvious from the profiles, I think, where the opportunities are.

        Hugh

Yeah ...

I have a lot of requirements of the kind shown below, where ':asdf' would
be passed in as a parameter. I have written trivial versions of both
'case' and 'if/elsif/else'. The difference is very little, even over 10
million comparisons. But, in fact, 'case' seems to be faster.

js@srinivasj:~> cat tmp/test/1.rb
def timer
  b = Time.now
  yield
  e = Time.now
  puts "Time taken is #{e - b} seconds."
end

timer do
  10_000_000.times do | __ |
    x = case :asdf
        when :a
          'a'
        when :b
          'b'
        when :c
          'c'
        when :d
          'd'
        when :e
          'e'
        else
          'asdf'
        end
  end
end

timer do
  10_000_000.times do | __ |
    x = if :asdf == :a
          'a'
        elsif :asdf == :b
          'b'
        elsif :asdf == :c
          'c'
        elsif :asdf == :d
          'd'
        elsif :asdf == :e
          'e'
        else
          'asdf'
        end
  end
end

Were you referring to some other kind of 'if/then'? I am very interested
in this, since, as I mentioned above, I need this construct several
times.

Greetings,
JS

···

On Fri, 2006-09-15 at 21:43 +0900, Francis Cianfrocca wrote:

Some of the surprises: Symbol#=== is really slow. Replace case statements
against Symbols with if/then constructions.

M. Edward (Ed) Borasky wrote:

/ ...

Speaking of such, if you are set up to rebuild your Linux kernel,
single-processor machines tend to run faster if you turn off SMP when
you rebuild the kernel.

Further, I have seen single-processor machines lock up when running some
builds of SMP kernels, to the degree that I never allow them to run.

···

--
Paul Lutus
http://www.arachnoid.com

Hugh Sasse wrote:

> <OT>I've tried, but FORTH still hasn't clicked with me yet...</OT>

<not-quite-OT>

        [...]

</not-quite-OT>

Thanks, I'll reply off list about that.

> There exists at least this effort to use Genetic Algorithms for
> tuning compiler options. I've not explored it yet.
>
> http://www.coyotegulch.com/products/acovea/index.html
>
> One may need a cluster of machines (of many platforms?) to do this
> usefully, but still. Maybe Rinda can help us all contribute...

I think I installed acovea once -- it's part of Gentoo -- but I don't
remember doing anything with it. But the concept is certainly
intriguing, and might be more so to the folks on this list who are
always talking about how machine cycles are cheaper than programmer
cycles. Of course, if the programmer has to spend his or her cycles
waiting for a genetic algorithm to converge ...

:slight_smile:

GAs aren't that quick, and they would not be for a Ruby build. But it's
something to explore, just because it might teach us something.

I've had bad experiences in the past with this sort of optimization.
"Real" compiler optimization is a hard problem in the complexity sense,
plus there's all the time you have to spend correctness-testing the

Well, at least we have a set of tests for ruby, and we can use that
as part of the fitness function.
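
A sketch of what I mean, with a made-up benchmark-timing helper
(run_benchmark_seconds is purely hypothetical):

# Hypothetical fitness function for a GA searching over sets of compiler
# flags: a build that fails Ruby's own test suite scores zero; otherwise,
# the faster the benchmark run, the fitter the flag set.
def fitness(build_dir)
  return 0.0 unless system("make -C #{build_dir} test > /dev/null 2>&1")
  seconds = run_benchmark_seconds(build_dir)   # hypothetical timing helper
  1.0 / seconds
end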

optimized versions. My experience has been it's far better to pluck the
low-hanging fruit, which is what gcc does by itself, and which is what
the designers of virtual machines do.

People have stated that the implementation of method dispatch in Ruby is naive.

http://smallthought.com/avi/?p=16

that creating Procs and continuations is slow:

http://lambda-the-ultimate.org/node/1470

and other people have mentioned the garbage collection system.

I'm certainly not in a position to suggest what might be done about
these things, or to denigrate the implementations as they stand, but
these are about the only specific things I can find people pointing
to (other than the general remarks about Ruby being slow, which add
more heat than light). So I think we have some juicy pieces of fruit
to bite into here, but I don't think they are low-hanging, not for
me anyway. :slight_smile:

        Hugh

···

On Sat, 16 Sep 2006, M. Edward (Ed) Borasky wrote:

You have a literal Symbol in the case statement. Try it with a variable that
refers to an object of type Symbol. I got nearly a three percent speed
improvement by changing this out, and it was only over a few hundred
thousand iterations, not ten million. I haven't looked at the implementation
(yet) so I have no clue why this behaves as it does.
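
A sketch of that variation (same shape as the script above, just with the
tested value held in a variable and the iteration count reduced):

require 'benchmark'

sym = :asdf   # the value now comes from a variable, not a literal Symbol

Benchmark.bm(10) do |bm|
  bm.report('case') do
    1_000_000.times do
      case sym
      when :a then 'a'
      when :b then 'b'
      when :c then 'c'
      else 'asdf'
      end
    end
  end

  bm.report('if/elsif') do
    1_000_000.times do
      if    sym == :a then 'a'
      elsif sym == :b then 'b'
      elsif sym == :c then 'c'
      else 'asdf'
      end
    end
  end
end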

···

On 9/15/06, Srinivas JONNALAGADDA <srinivas.j@siritech.com> wrote:

     x = case :asdf
        when :a
          'a'

Hugh Sasse wrote:

> [...]

I can't recall whether I read about this on this site or in some
magazine article, but I recall it being interesting to me: I think it
was called profile-driven optimization. My vague recollection is that
gcc can optimize based on a runtime profile. So, you run Ruby over
your application while profiling it all together, essentially profiling
Ruby in the context of your application. Then you use profile-guided
gcc to rebuild a version of Ruby optimized for your application. Might
be worth some research.

Ken

···

On Sat, 16 Sep 2006, M. Edward (Ed) Borasky wrote:

Hugh Sasse wrote:

> [...]

I can't recall whether I read about this on this site or in some
magazine article, but I recall it being interesting to me: I think it
was called profile-driven optimization. My vague recollection is that
gcc can optimize based on a runtime profile. So, you run Ruby over
your application while profiling it all together, essentially profiling
Ruby in the context of your application. Then you use profile-guided
gcc to rebuild a version of Ruby optimized for your application. Might
be worth some research.

RedHat Magazine had an article on GCC optimizations that talked about
this:

http://www.redhat.com/magazine/011sep05/features/gcc/

it's also a recurring topic at the GCC summit:
http://www.gccsummit.org/2005/view_abstract.php?content_key=7
http://www.gccsummit.org/2006/view_abstract.php?content_key=17

···

On 9/15/06, Kenosis <kenosis@gmail.com> wrote:

> On Sat, 16 Sep 2006, M. Edward (Ed) Borasky wrote:

Ken

--
thanks,
-pate
-------------------------

pat eyler wrote:

I can't recall whether I read about this on this site or in some
magazine article, but I recall it being interesting to me: I think it
was called profile-driven optimization. My vague recollection is that
gcc can optimize based on a runtime profile. So, you run Ruby over
your application while profiling it all together, essentially profiling
Ruby in the context of your application. Then you use profile-guided
gcc to rebuild a version of Ruby optimized for your application. Might
be worth some research.

RedHat Magazine had an article on GCC optimizations that talked about
this:

http://www.redhat.com/magazine/011sep05/features/gcc/

it's also a recurring topic at the GCC summit:
http://www.gccsummit.org/2005/view_abstract.php?content_key=7
http://www.gccsummit.org/2006/view_abstract.php?content_key=17

Well ... the good news is that I have gcc 4.1.1 and a "test suite"
consisting of a single benchmark, plus scripts to build Ruby and
YARV-Ruby with "gprof" enabled. The bad news is that I have very little
play time this weekend because I'm attending a couple of workshops on
... well ... other programming languages. :slight_smile:

The test suite can be found at

http://rubyforge.org/cgi-bin/viewvc.cgi/MatrixBenchmark/?root=cougar

By the way, the kind of code I'm interested in running efficiently in
Ruby is well-represented by the matrix benchmark. Pretty much everything
I want to do can be expressed ultimately in terms of matrix
multiplication, and I'm on the verge of filing a Ruby Change Request to
get the Mathn, Rational, Complex and Matrix libraries coded up in C and
made part of the base Ruby language.

I'm well aware of the dozens of C and C++ math libraries that have been
interfaced with Ruby, and dozens more that *could* be interfaced, given
some love (and SWIG). :slight_smile: However, the "pure Ruby" libraries I listed
above are exactly what I need.
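
For the curious, the flavour of that benchmark can be approximated with the
standard library alone; a toy version (sizes picked arbitrarily, and not
taken from the real MatrixBenchmark scripts) looks like this:

require 'matrix'
require 'benchmark'

# Multiply two random n-by-n matrices with the pure-Ruby Matrix class from
# the standard library -- exactly the kind of inner loop that would benefit
# from a C implementation.
n = 60
a = Matrix.rows(Array.new(n) { Array.new(n) { rand } })
b = Matrix.rows(Array.new(n) { Array.new(n) { rand } })

puts Benchmark.measure { a * b }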

···

On 9/15/06, Kenosis <kenosis@gmail.com> wrote:

For those who are interested, I already measured the effect of GCC
optimization flags on Ruby speed using the MatrixBenchmark, which was the
only one I had at the time. Results here:
http://www.jhaampe.org/software/ruby-gcc

···

--
Sylvain Joyeux

Well, I do not believe in the OP's right to have the thread follow the
topic in a mailing list; the irony is that I just got off another thread
because some other people do not share that idea, and I thought it might be a
good idea to respect their beliefs.

However, what I really wanted to say:
Very interesting, but is there nothing that can be said about Ruby itself?
Everybody is talking about implementations and tweaks.
I thought it might be a good idea to talk about crimes and blunders,
performance-wise.
I am not sure any more that this is possible; maybe we should wait for YARV
and Ruby 2 to have a sound base for discussion.

Thanks for *all* contributions (including the future ones) :wink:

Robert

···

On 9/15/06, Sylvain Joyeux <sylvain.joyeux@polytechnique.org> wrote:

For those who are interested, I already measured the effect of GCC
optimization flags on Ruby speed using the MatrixBenchmark, which was the
only one I had at the time. Results here:
http://www.jhaampe.org/software/ruby-gcc
--
Sylvain Joyeux

I find all that very interesting, completely OT, but interesting.

--
Two things are infinite: the universe and human stupidity; and as far as
the universe is concerned, I have not yet acquired absolute certainty.

- Albert Einstein

Quoting Sylvain Joyeux <sylvain.joyeux@polytechnique.org>:

For those who are interested, I already measured the effect of GCC
optimization flags on Ruby speed using the MatrixBenchmark, which was the
only one I had at the time. Results here:
http://www.jhaampe.org/software/ruby-gcc

Very interesting ... 22 percent speedup overall. I believe, though I'd need to
go back and check the scripts, that when I ran the original tests, I compiled
with -O3 and "-march" for my machine, which is my default setting for Gentoo
Linux. But I didn't know about the GCC feature of recompiling after profiling,
so I didn't try that. It's clearly worth doing, and when I get some free time
I'll put it in the benchmark.

While we're on the subject, I haven't forgotten the needs of our Windows
brethren. :slight_smile: Does anyone here know if either the Visual Studio Express C++ or
MinGW compilers are capable of

a) Tuning code to the architecture a la GCC's "-march", and/or
b) Inserting profiling counters and recompiling on the basis of the profiled
runs?

···

--
Sylvain Joyeux