Any tricks to speed up ruby?

and voila, a faster Ruby (not sure if the real reason was the compiler
flags, the pthread, or the updated version from p110 to p114, but
something helped it--I'm guessing it was the compiler options).

new ruby with compiler options:
time ./ruby -e "10000000.times {}"
real 0m4.049s

old ruby (the mac osx port one):
time ruby -e "10000000.times {}"
real 0m5.400s

So it turns out that the difference seems to lie between p110 and p114.
The compiler options accounted for maybe 0.2s of it.

Some interesting benchmarks:

p110: 5.4s
p111: 5.23s
p114: 4.1s
latest stable snapshot from the ruby-lang page: 5.8s

[The latest snapshot build doesn't run, since it appears to be based on
1.9 (?)]

Anybody have any idea what might be going on here? All were compiled
similarly, yet p114 seems to smoke the rest.

Thanks.
-R

[Fri Mar 21 15:39:11 ~/Downloads/ruby-1.8.6-p114 ]$ time ./ruby -e "10000000.times {}"

real 0m4.143s
user 0m3.601s
sys 0m0.026s
[Fri Mar 21 15:39:18 ~/Downloads/ruby-1.8.6-p114 ]$ time ruby -e "10000000.times {}" # 'normal' ruby p111

real 0m5.742s
user 0m4.616s
sys 0m0.063s
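For anyone wanting to repeat the measurement without interpreter startup mixed in, here is a minimal sketch using the standard `benchmark` library. The helper names `time_empty_loop` and `best_of` are made up for illustration, and taking the best of a few runs is just one way to cut down on noise.

```ruby
require 'benchmark'

# Time an empty block called `count` times; returns wall-clock seconds.
def time_empty_loop(count)
  Benchmark.realtime { count.times {} }
end

# Repeat the measurement and keep the fastest run to reduce noise.
def best_of(runs, count)
  (1..runs).map { time_empty_loop(count) }.min
end

puts format('best of 3: %.3fs', best_of(3, 10_000_000))
```

Run it with each interpreter build in turn (for example `./ruby bench.rb`, where `bench.rb` is whatever you name the file) and compare the numbers.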

···

--
Posted via http://www.ruby-forum.com/.

...at least take me out for a drink first...

eww

···

On Jan 14, 2008, at 20:23 , s.ross wrote:

If you have a Rails app that is taking a long time in Ruby code, then you obviously have a compute-intensive action. Perhaps you could find the most expensive part of that action and rewrite it in C using ruby-inline (http://www.zenspider.com/ZSS/Products/RubyInline/). If you haven't looked into this, and if you really need to write some tight C code, then you will kiss the feet of Ryan Davis for making it all so easy.

It's an interesting thought. However, I wasn't able to get gcc 4.2.2 to do some simpler things, like profile-based optimization, on the Ruby source, so I wouldn't expect something that complex to work out of the box. There are a lot of great things in gcc, but not many of them are as well tested as, say, the standard modular C library or program, and, of course, the Linux kernel.

I'm not sure that's simpler... I also looked at that once and decided
there were some distinctly unsimple things going on.

If I'm right about kde, then yes, it's a very well tested feature. Also,
there would be major impetus from embedded systems users to use whole
program optimization features.

As far as I know, "-O3 -march=<your processor type>" is about the best you can get out of gcc without a *lot* of work.

In the "medium work" category I suspect there are things relating to
function attributes and builtins that could get 5% or so more juice
(but tend to clutter the code with unportable improvements).

And there are a lot more things you can do at the Ruby source level
that have a bigger payoff than that does.

As always. My 100-10-1 rule of thumb is to expect speed up factors of up
to about 100x for using a much better algorithm, factors up to about
10x for code tweaks, factors up to 2x but usually near 1x for compiler
optimization tweaks.
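To make the "much better algorithm" end of that rule concrete, here is a small sketch (not from the thread, names are hypothetical) that answers the same question two ways: a linear `Array#include?` scan per query versus a hash-based `Set` lookup. The actual speedup depends entirely on the input sizes, so treat the printed timings as illustrative only.

```ruby
require 'set'
require 'benchmark'

# Count how many queries occur in haystack, scanning the array each time.
# Cost grows with haystack.size * queries.size.
def count_hits_scan(haystack, queries)
  queries.count { |q| haystack.include?(q) }
end

# Same answer, but build a Set once and do hash lookups.
# Cost grows roughly with haystack.size + queries.size.
def count_hits_set(haystack, queries)
  lookup = Set.new(haystack)
  queries.count { |q| lookup.include?(q) }
end

haystack = (0...5_000).to_a
queries  = (0...5_000).map { |i| i * 2 }

scan = Benchmark.realtime { count_hits_scan(haystack, queries) }
set  = Benchmark.realtime { count_hits_set(haystack, queries) }
puts format('scan: %.4fs  set: %.4fs', scan, set)
```

No compiler flag changes the growth rate of the first version; only the change of data structure does.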

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

···

On Wed, 16 Jan 2008, M. Edward (Ed) Borasky wrote:

I suspect this is pretty much a no-op, as most of the code is in libruby
anyway and it usually has to be compiled with -fPIC to link at all.

Thanks

Michal

···

On 12/03/2008, Paul Brannan <pbrannan@atdesk.com> wrote:

On Wed, Mar 12, 2008 at 04:58:27PM +0900, Roger Pack wrote:
> Wow that does indeed help (once figured out). for me on my G4 it was
> (after I figured out I had a 7450 processor)
> export CFLAGS='-mtune=7450 -mcpu=7450 -fast -fPIC'
>
> and compile with --disable-pthread
>
> and voila, a faster Ruby (not sure if the real reason was the compiler
> flags, the pthread, or the updated version from p110 to p114, but
> something helped it--I'm guessing it was the compiler options).

I suspect --disable-pthread had the largest impact. The cost of memory
allocations can be high when linking with the threading library. Re-run
your tests without the other options if you want to verify.
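A rough way to do that comparison is to time the same one-liner against each build from a small driver script. This is only a sketch: `time_interpreter` and the `builds` table are hypothetical, and you would add one entry per binary you compiled (with and without --disable-pthread, say). Like `time`, it includes interpreter startup.

```ruby
require 'rbconfig'

SNIPPET = '10000000.times {}'

# Run `snippet` with the interpreter at `ruby_path`.
# Returns elapsed wall-clock seconds, or nil if the run failed.
def time_interpreter(ruby_path, snippet = SNIPPET)
  started = Time.now
  ok = system(ruby_path, '-e', snippet)
  ok ? Time.now - started : nil
end

# Hypothetical table of builds; only the running interpreter is listed here.
builds = { 'current interpreter' => RbConfig.ruby }

builds.each do |label, path|
  elapsed = time_interpreter(path)
  if elapsed
    puts format('%-20s %.3fs', label, elapsed)
  else
    puts "#{label}: failed to run"
  end
end
```

Changing one configure option per build is what lets you attribute a difference to that option rather than to the whole set.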

I'm also surprised you got improved performance with -fPIC. I thought
position-independent code was supposed to run slower, usually.

Anybody have any idea what might be going on here? All were compiled
similarly, yet p114 seems to smoke the rest.

I'm guessing it's the way you built it; these are the only changes
listed in the ChangeLog between p111 and p114:

···

On Sat, Mar 22, 2008 at 06:40:26AM +0900, Roger Pack wrote:

Mon Mar 3 23:34:13 2008 GOTOU Yuuzou <gotoyuzo@notwork.org>

  * lib/webrick/httpservlet/filehandler.rb: should normalize path
    separators in path_info to prevent directory traversal attacks
    on DOSISH platforms.
    reported by Digital Security Research Group [DSECRG-08-026].

  * lib/webrick/httpservlet/filehandler.rb: pathnames which have
    not to be published should be checked case-insensitively.
    
Mon Dec 3 08:13:52 2007 Kouhei Sutou <kou@cozmixng.org>

  * test/rss/test_taxonomy.rb, test/rss/test_parser_1.0.rb,
    test/rss/test_image.rb, test/rss/rss-testcase.rb: ensured
    declaring XML namespaces.

Paul

Ground on which you walk, then, Ryan.

eeewwww ;/

···

On Jan 14, 2008, at 8:30 PM, Ryan Davis wrote:

On Jan 14, 2008, at 20:23 , s.ross wrote:

If you have a Rails app that is taking a long time in Ruby code, then you obviously have a compute-intensive action. Perhaps you could find the most expensive part of that action and rewrite it in C using ruby-inline (http://www.zenspider.com/ZSS/Products/RubyInline/). If you haven't looked into this, and if you really need to write some tight C code, then you will kiss the feet of Ryan Davis for making it all so easy.

...at least take me out for a drink first...

eww

This is getting off the Ruby topic, but I think you're being unfair to optimizers (or rather, those who write them). The great thing about optimizers is that we can write clear and expressive code and not worry so much about strength reduction or loop unrolling or all those other things optimizers do for us. Honestly, does it make sense to you to read code like this:

a = (a << 1) + a;

-or-

a *= 3;

The performance difference in a tight loop would be noticeable, but the next person to pick up the code would probably get a quizzical look on his or her face reading it. So the benchmark improvement may be between 1 and 2 percent, but the maintainability improvement could be of far greater benefit.
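As a quick sanity check of the strength-reduction identity, multiplying by three is the same as shifting left by one and adding the original value, i.e. `(a << 1) + a`. The sketch below, with made-up method names, checks that in Ruby for a few integers, including a negative one and one large enough to be a Bignum on older Rubies.

```ruby
# Multiply by three, written the readable way.
def triple_plain(a)
  a * 3
end

# The same thing written as shift-and-add: 2a + a.
def triple_shifted(a)
  (a << 1) + a
end

[0, 1, 7, -4, 2**40].each do |a|
  raise "mismatch for #{a}" unless triple_plain(a) == triple_shifted(a)
end

puts triple_shifted(7) # prints 21
```

The point stands either way: the readable form is the one to write, and the rewrite is the optimizer's job.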

I'm not too familiar with gcc's global opts, but these are the most dangerous and, at the same time, the least obvious ones. If the optimizer gets it wrong, the code can break in the oddest ways; however, when the optimizer nails a global opt, it can make a difference you might never have predicted.

I am truly a fan of compiler optimizations because they fall into the "geez, that really is smart stuff" category. Still, you are absolutely correct that a better algorithm will win almost every time.

Just my $.02 :slight_smile:

···

On Jan 16, 2008, at 7:04 PM, John Carter wrote:

factors up to 2x but usually near 1x for compiler
optimization tweaks

Perhaps it depends on the platform, but on x86 linux, libruby is a
static library by default unless --enable-shared is used.

Paul

···

On Thu, Mar 13, 2008 at 12:34:31AM +0900, Michal Suchanek wrote:

I suspect this is pretty much a no-op, as most of the code is in libruby
anyway and it usually has to be compiled with -fPIC to link at all.

Paul Brannan wrote:

I'm guessing it's the way you built it; these are the only changes
listed in the ChangeLog between p111 and p114:

Yep you were right on.

stable branch is fast

[Mon Mar 24 15:09:40 ~/Downloads/ruby_stable ]$ time ./ruby -e "10000000.times {}"

real 0m4.222s

p111 is fast
time /usr/bin/ruby_old -e "10000000.times {}"

real 0m4.276s

and the rest in between are similarly fast.

The only truly slow one appears to be the MacPorts version. I don't know
what compile flags they are using, but it really does appear slower.

time ruby -e "10000000.times {}"

real 0m5.710s
(consistently)
Despite that, they both have similar startup speeds.

Thanks for pointing that out!

···

--
Posted via http://www.ruby-forum.com/.

What... and waste a good drink?!?!

Mikel

···

> On Jan 14, 2008, at 20:23 , s.ross wrote:
>> some tight C code, then you will kiss the feet of Ryan Davis for
>> making it all so easy.

On Jan 14, 2008, at 8:30 PM, Ryan Davis wrote:
> ...at least take me out for a drink first...
> eww

On Jan 15, 2008 5:26 PM, s.ross <cwdinfo@gmail.com> wrote:

Ground on which you walk, then, Ryan.
eeewwww ;/

ruby -rrbconfig -e 'puts Config::CONFIG["configure_args"]'
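The same lookup can be done from a script. This sketch uses the `RbConfig` spelling of the constant, which is the name newer Rubies use (on the 1.8.6 builds discussed here the one-liner above with `Config` applies). The `build_settings` helper and the choice of keys are just for illustration.

```ruby
require 'rbconfig'

# Build-time settings worth comparing between two interpreters.
BUILD_KEYS = %w[configure_args CFLAGS].freeze

# Pull the selected keys out of a config hash (defaults to this interpreter's).
def build_settings(config = RbConfig::CONFIG)
  BUILD_KEYS.each_with_object({}) { |key, out| out[key] = config[key] }
end

build_settings.each { |key, value| puts "#{key}: #{value}" }
```

Running it under both the MacPorts ruby and a hand-built one should show how the two were configured.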

Paul

···

On Tue, Mar 25, 2008 at 06:14:38AM +0900, Roger Pack wrote:

The only truly slow one appears to be the MacPorts version. I don't know
what compile flags they are using, but it really does appear slower.