Ruby is a slow performer

Sergei Gnezdov wrote:

Hi,

I just found an interesting benchmark site. I am sure many of you know about it :)

  http://shootout.alioth.debian.org/

Ruby is one of the slowest performers among compared languages.
Are there any plans to get better?

I guess there are :)
The latest message about YARV mentions plans to make Ruby from 50% to >300% faster (warning: benchmarks are 80% made of..), not to mention the rubydium JIT.
So let's join Koichi and Alexander in hacking on them ASAP :)

About the shootout: most of the stuff is plain useless in Ruby.
You won't implement heapsort in Ruby, because you have sort/sort_by.
Nor would you use
http://shootout.alioth.debian.org/lang/ruby/random.ruby.html
because you have rand().
(Note that this gen_random algorithm is used in other tests too, slowing down even those.)
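To illustrate the point, here is a small sketch of what idiomatic Ruby would reach for in place of a hand-written heapsort or random generator. The sample data and the fixed seed are invented for the example.

```ruby
# Sample data, invented for this illustration.
words = %w[pear fig banana kiwi]

# Built-in sort: no hand-written heapsort needed.
sorted = words.sort

# sort_by sorts on a computed key, here the word length
# (ties are broken alphabetically).
by_length = words.sort_by { |w| [w.length, w] }

# Built-in random numbers instead of a hand-rolled gen_random.
srand(42)        # fixed seed so that runs are repeatable
r = rand         # Float in [0, 1)
n = rand(100)    # Integer in 0...100

p sorted
p by_length
```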

Basically, remember that Ruby is 'fast enough' for most things and useless for others. You can't have everything, I guess :)

PS
Anyway, stuff from that shootout is wildly unidiomatic, just look at
http://shootout.alioth.debian.org/lang/ruby/lists.ruby.html

I remember sending a message to them but getting no feedback. Maybe I should try again.

gabriele renzi wrote:

About the shootout: most of the stuff is plain useless in Ruby.
You won't implement heapsort in Ruby, because you have sort/sort_by.
Nor would you use
http://shootout.alioth.debian.org/lang/ruby/random.ruby.html
because you have rand().

You are correct. I would say that the shootout is extremely
misleading. For example, the Python people use dubious tricks
to make it look fast: the matrix multiplication is
implemented by calling an extension method.

Basically, remember that Ruby is 'fast enough' for most things and useless for others. You can't have everything, I guess :)

I find OCaml very appropriate for almost all tasks ;)
Only for very short scripts (up to 100 lines) does Ruby
have a clear competitive edge.

Why is that unrealistic? You could do the same in Ruby by
implementing the library 'Matrix' class in C. We already have
a library that handles the stuff; it's just a matter of
optimization (== time == extra code in C that's harder to
maintain, unfortunately). I actually played with this a bit.
Right now, using the Ruby matrix class does two things to
that test:

  a) Reduces the lines of code by nearly half
  b) Increases the runtime by a factor of two. (ouch)

If we had a superfast matrix class, it would give (a) along
with decreasing the runtime by a factor of ten or so. Sounds
reasonable. The same test in matlab would be about 1/4 the lines of
code and 20x faster if implemented the way it should be.

Admittedly, most people would probably use external hooks
to lapack for serious matrix manipulation, but who's counting?
There's something to be said, in all honesty, for fast built-in
library support.

The shootout _is_ useful for understanding some general
aspects of the languages, and for looking for things that
you might want to really work on speeding up (say, for
instance, the fact that Ruby's hash accesses seem to be
about 1/2 the speed of Perl's). ;)

  -Dave


On Sat, Nov 06, 2004 at 06:23:41AM +0900, Christian Szegedy scribed: > gabriele renzi wrote:

--
work: dga@lcs.mit.edu me: dga@pobox.com
      MIT Laboratory for Computer Science http://www.angio.net/

bazad wrote:

I was surprised to learn about the existence of OCaml. I just don't know
how valuable (in all possible meanings) it is.

I personally find it extremely valuable.

OCaml has a somewhat unfamiliar syntax at first glance.
The first two days with OCaml were really hard and I cursed
a lot. Nowadays I find its syntax very natural.

The only problem is the limited number of extensions,
but for most tasks there is one.

We have a superfast matrix class: NArray (not in the standard lib).
It is claimed to be 3x faster than NumPy for some tests and a bit faster
than Octave for matrix operations.

http://www.ir.isas.ac.jp/~masa/ruby/index-e.html

Regards,

  Michael


On Sat, Nov 06, 2004 at 06:31:33AM +0900, David G. Andersen wrote:

David G. Andersen wrote:

If we had a superfast matrix class, it would give (a) along
with decreasing the runtime by a factor of ten or so. Sounds
reasonable. The same test in matlab would be about 1/4 the lines of
code and 20x faster if implemented the way it should be.

Great! Then put all C-solutions from the shootout into
an extension and call them from Ruby. This way, Ruby
would have the second place in runtime after C and
the 1st in LOC count.

The shootout _is_ useful for understanding some general
aspects of the languages, and for looking for things that
you might want to really work on speeding up (say, for
instance, the fact that Ruby's hash accesses seem to be
about 1/2 the speed of Perl's. :wink:

It is not even clear that this is really the hash-access.
It could be the GC, the loop or who knows what...

In article <20041105220018.GA5484@miya.intranet.ntecs.de>,

We have a superfast matrix class: NArray (not in the standard lib).
It is claimed to be 3x faster than NumPy for some tests and a bit faster
than Octave for matrix operations.

http://www.ir.isas.ac.jp/~masa/ruby/index-e.html

Is the Python extension in question part of Python's standard lib? If so,
I could see how using it in the shootout would be OK. Perhaps it's time
to nominate NArray for inclusion in Ruby's standard lib. Given that
matrix operations are one of the more compute intensive tasks it might be
a good idea.

....Also, I wonder if NArray might be in need of renaming. maybe something
like FastArray/FastMatrix/FastVector - that would make the purpose perhaps
a bit clearer. I find that I always forget about NArray when I need a
fast matrix (which isn't often, but sometimes).

Phil


Michael Neumann <mneumann@ntecs.de> wrote:

On Sat, Nov 06, 2004 at 06:31:33AM +0900, David G. Andersen wrote:

On Sat, Nov 06, 2004 at 06:23:41AM +0900, Christian Szegedy scribed: >> > gabriele renzi wrote:

On Sat, Nov 06, 2004 at 07:00:23AM +0900, Michael Neumann scribed:

>
> If we had a superfast matrix class, it would give (a) along
> with decreasing the runtime by a factor of ten or so. Sounds

We have a superfast matrix class: NArray (not in the standard lib).
It is claimed to be 3x faster than NumPy for some tests and a bit faster
than Octave for matrix operations.

http://www.ir.isas.ac.jp/~masa/ruby/index-e.html

This looks great, btw. Not having looked at the code
size, I don't know if it's worth having it replace the library
version of 'Matrix' or not. If we actually just want some
quick speedups to our current matrix class, I offer the
following bit of code that performs most of the matrix multiplication
in C. It could be faster if it assumed anything about the contents
of the matrix; as it is, it's pretty much a straightforward
translation of the current matrix code into C. It speeds matrix
multiplication up by about an order of magnitude. Division is
implemented as multiplication by the inverse, so it's also accelerated
somewhat.

The code is attached at the end of this message. It should probably
be extended with a few more sanity checks by someone who knows what
they're doing. I haven't tested it extensively. I mostly wrote
it to see how well it would work to just make a minor tweak with a
little bit of C code, while preserving the majority of the 1,200
lines of ruby. The answer seems to be "pretty well!"

It takes 100 iterations of the computer shootout matrix
test from 7.5 seconds (their version) / 13 seconds (ruby 'matrix')
down to about 1.3 seconds. The perl version takes 3.4 seconds.
The modified python version takes under .2 seconds, in comparison,
as would a Ruby version using NArray...

The happy part about this is that it's all of 103 lines of
C, since all of the wrapper code for matrix handling is still done
in ruby. To use it, modify matrix.rb to include fastmath after
the class definition, and dispatch to the mult version instead.
Bonus points if you want to make it depend on having fastmath
installed; should be easy. :slight_smile:


***************
*** 107,112 ****
--- 107,113 ----
  class Matrix
    @RCS_ID='-$Id: matrix.rb,v 1.11 1999/10/06 11:01:53 keiju Exp keiju $-'
    
+ require 'fastmath'
  # extend Exception2MessageMapper
    include ExceptionForMatrix
    
***************
*** 465,470 ****
--- 466,472 ----
        return r.column(0)
      when Matrix
        Matrix.Raise ErrDimensionMismatch if column_size != m.row_size
+ return Matrix[ *self.mult(m) ]
      
        rows = (0 .. row_size - 1).collect {
          |i|

  -Dave

=========== ext/matrix/fastmath/extconf.rb =======
require 'mkmf'
create_makefile 'matrix/fastmath'

=========== ext/matrix/fastmath/fastmath.c =======

/*
    fastmath.c -- matrix manipulation core
  
    Copyright (c) 2004 David Andersen <dga@pobox.com>
  
    This library is free software.
    You can distribute/modify this program under the same terms of ruby.
    (I hereby assign copyright to matz if he wants it).

*/

#include "ruby.h"
#include <stdio.h>

VALUE matclass; /* Set in Init */
static ID id_length; /* rb_intern returns an ID, not an int */
static ID id_plus;
static ID id_mul;

#define FASTMATH_VERSION "0.0.1"

/*
=begin
= Fastmath methods.
=end
*/

/*
=begin
--- mult
    Multiplies two matrices. Returns an array that must be
    coerced back into a matrix by the caller.
=end
*/

static VALUE
fastmath_mult(VALUE self, VALUE other)
{
  VALUE a, b, c, item, tmp, val, row, m1i, at, bt;
  int arows, acols;
  int brows, bcols;
  int crows, ccols;
  int i, j, k;
  /* Check the argument type before touching its instance variables */
  if (rb_obj_is_instance_of(other, matclass) != Qtrue) {
    rb_raise(rb_eTypeError, "type of arg must be Matrix");
    return self;
  }
  /* Note: Might be better to call to_a .. this is not speed critical */
  a = rb_iv_get(self, "@rows");
  b = rb_iv_get(other, "@rows");
  tmp = rb_funcall(a, id_length, 0);
  arows = NUM2INT(tmp);
  tmp = rb_funcall(b, id_length, 0);
  brows = NUM2INT(tmp);
  if (!arows || !brows) return self;

  item = rb_ary_entry(a, 0);
  if (NIL_P(item) || TYPE(item) != T_ARRAY) {
    rb_raise(rb_eIndexError, "Invalid self matrix");
    return self;
  }
  tmp = rb_funcall(item, id_length, 0);
  acols = NUM2INT(tmp);

  item = rb_ary_entry(b, 0);
  if (NIL_P(item) || TYPE(item) != T_ARRAY) {
    rb_raise(rb_eIndexError, "Invalid other matrix");
    return self;
  }
  tmp = rb_funcall(item, id_length, 0);
  bcols = NUM2INT(tmp);

  crows = arows;
  ccols = bcols;

  c = rb_ary_new2(crows);
  for (i = 0; i < crows; i++) {
    row = rb_ary_new2(ccols);
    m1i = rb_ary_entry(a, i);
    for (j = 0; j < ccols; j++) {
      val = INT2FIX(0);
      for (k = 0; k < acols; k++) { /* inner dimension: columns of a */
        at = rb_ary_entry(m1i, k);
        bt = rb_ary_entry(rb_ary_entry(b, k), j);
        val = rb_funcall(val, id_plus, 1,
          rb_funcall(at, id_mul, 1, bt));
      }
      rb_ary_store(row, j, val);
    }
    rb_ary_store(c, i, row);
  }

  return c; /* remember to coerce to matrix in caller */
}

void
Init_fastmath()
{
  matclass = rb_const_get(rb_cObject, rb_intern("Matrix"));
  rb_define_method(matclass, "mult", fastmath_mult, 1);
  id_length = rb_intern("length");
  id_plus = rb_intern("+");
  id_mul = rb_intern("*");
}

>
>If we had a superfast matrix class, it would give (a) along
>with decreasing the runtime by a factor of ten or so. Sounds
>reasonable. The same test in matlab would be about 1/4 the lines of
>code and 20x faster if implemented the way it should be.

Great! Then put all C-solutions from the shootout into
an extension and call them from Ruby. This way, Ruby
would have the second place in runtime after C and
the 1st in LOC count.

  This doesn't follow. Ruby _already_ has a matrix convenience
class - my point is simply that by adding 61 additional lines of
code (73 lines of C - the 12 lines of ruby it replaces), we turn
the internal matrix class from something that's twice as slow
as what you would get if you implemented the whole thing in
optimized ruby to something that's 5x faster. Being able to use
an already well-implemented matrix class is a very nice bonus;
who wants to roll their own just to speed things up a bit?

  Matrix manipulation is _not_ a "just for the benchmarks"
optimization. There are a lot of cases in which having a pleasantly
fast, easy to use matrix multiply is beneficial, but you don't want
to quite go all the way and force an external dependency on NArray.

(By way of explanation: I use Ruby for a lot of data analysis; most
of what I do is string manipulation and counting things in hash
tables, but every now and then I have to throw in a little math.
It's not speed-utterly-critical, but it's speed-pleasant. My updated
matrix multiply code fills that space...)

The main point I was exploring in this thread was the ease
of optimizing a few bits of performance-intensive ruby code with
a very small, manageable bit of C. I find it much more pleasant
to have the bulk of the code I implement be in ruby, where it
looks like:

  class Window_Repository
    attr_reader :winsize

    def initialize(winsize, gc_thresh = 100)
      @winsize, @gc_thresh = winsize, gc_thresh
      ...
    end
  end

instead of

  myClass = rb_define_class("Window_Repository", rb_cObject);
  ...
  static VALUE
  window_repository_init(...) {
    ...
  }

but still be able to make a couple of small tweaks to get
an order of magnitude performance improvement.

>The shootout _is_ useful for understanding some general
>aspects of the languages, and for looking for things that
>you might want to really work on speeding up (say, for
>instance, the fact that Ruby's hash accesses seem to be
>about 1/2 the speed of Perl's. :wink:

It is not even clear that this is really the hash-access.
It could be the GC, the loop or who knows what...

  Disabling the GC doesn't help most of the hash-bound benchmarks
much if they're written properly. The basic loop iteration is
quite fast if you remove the hash access. You can draw your own
conclusions from this.

  I admit that I may be an atypical ruby user. :) But if we
can manage it, things that speed up ruby without harming
maintainability or flexibility are good for everyone. This
is why it's occasionally useful to look at benchmarks, flawed
as they may be. The shootout certainly has its flaws, but it
can also point out areas where improvement would help.

  -Dave


On Sun, Nov 07, 2004 at 07:48:38AM +0900, Christian Szegedy scribed: > David G. Andersen wrote:

On Sat, Nov 06, 2004 at 09:06:38AM +0900, David G. Andersen scribed:

I offer the following bit of code that performs most of the matrix
multiplication in C.

The code is attached at the end of this message. It should probably
be extended with a few more sanity checks by someone who knows what
they're doing. I haven't tested it extensively. I mostly wrote
[...]

  I've added the requisite sanity checks to the code and
converted it to directly access the array elements. It's about
10% faster, a bit more readable, and decidedly safer. :). As
before, it preserves the semantics of matrix.rb, so it should
be drop-in compatible.

  (Sorry for the 100 line email)

  -Dave

/*
    fastmath.c -- matrix manipulation core


Copyright (c) 2004 David Andersen <dga@pobox.com>
  
    This library is free software.
    You can distribute/modify this program under the same terms of ruby.
*/

#include "ruby.h"
#include <stdio.h>

VALUE matclass; /* Set in Init */
static ID id_length, id_plus, id_mul; /* rb_intern returns an ID */

#define FASTMATH_VERSION "0.0.1"

/*
=begin
= Fastmath methods.
=end
*/

/*
=begin
--- mult
    Multiplies two matrices
=end
*/

static VALUE
fastmath_mult(VALUE self, VALUE other)
{
  VALUE a, b, c, val, row, m1i, at, bt;
  int arows, brows, bcols, i, j, k;
  
  if (rb_obj_is_instance_of(other, matclass) != Qtrue) {
    rb_raise(rb_eTypeError, "type of arg must be Matrix");
  }

  a = rb_iv_get(self, "@rows");
  b = rb_iv_get(other, "@rows");
  if (NIL_P(a) || NIL_P(b) || TYPE(a) != T_ARRAY || TYPE(b) != T_ARRAY) {
    rb_raise(rb_eTypeError, "invalid matrix arguments");
  }
  arows = RARRAY(a)->len;
  brows = RARRAY(b)->len;
  if (!arows || !brows) {
    rb_raise(rb_eIndexError, "Zero-length matrix");
  }

  row = RARRAY(b)->ptr[0];
  if (NIL_P(row) || TYPE(row) != T_ARRAY) {
    rb_raise(rb_eIndexError, "Invalid other matrix");
  }
  bcols = RARRAY(row)->len;

  for (i = 0; i < brows; i++) {
    row = RARRAY(b)->ptr[i];
    if (TYPE(row) != T_ARRAY || RARRAY(row)->len != bcols) { /* every row of b must have bcols entries */
      rb_raise(rb_eIndexError, "Invalid other row");
    }
  }

  c = rb_ary_new2(arows);
  for (i = 0; i < arows; i++) {
    row = rb_ary_new2(bcols);
    m1i = RARRAY(a)->ptr[i];
    if (TYPE(m1i) != T_ARRAY || RARRAY(m1i)->len != brows) {
      rb_raise(rb_eIndexError, "Invalid self row len");
    }
    for (j = 0; j < bcols; j++) {
      val = INT2FIX(0);
      for (k = 0; k < brows; k++) { /* inner dimension: rows of b == columns of a */
        at = RARRAY(m1i)->ptr[k];
        bt = RARRAY(RARRAY(b)->ptr[k])->ptr[j];
        val = rb_funcall(val, id_plus, 1,
          rb_funcall(at, id_mul, 1, bt));
      }
      RARRAY(row)->ptr[j] = val;
    }
    RARRAY(row)->len = bcols;
    rb_ary_store(c, i, row);
  }

  if (OBJ_TAINTED(self) || OBJ_TAINTED(other)) OBJ_TAINT(c);
  return c; /* remember to coerce to matrix in caller */
}

void
Init_fastmath()
{
  matclass = rb_const_get(rb_cObject, rb_intern("Matrix"));
  rb_define_method(matclass, "mult", fastmath_mult, 1);
  id_length = rb_intern("length");
  id_plus = rb_intern("+");
  id_mul = rb_intern("*");
}
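For comparison, the same checked triple loop can be written in plain Ruby. This is only a reference sketch to make the C easier to follow: `mult_rows` is a hypothetical helper name, not part of matrix.rb or fastmath, and it works on plain arrays of rows rather than on Matrix objects.

```ruby
# Multiply two matrices given as arrays of rows.
# Hypothetical helper, written to mirror the checks in the C code.
def mult_rows(a, b)
  unless a.is_a?(Array) && b.is_a?(Array)
    raise TypeError, "invalid matrix arguments"
  end
  raise IndexError, "zero-length matrix" if a.empty? || b.empty?
  unless (a + b).all? { |row| row.is_a?(Array) }
    raise IndexError, "rows must be arrays"
  end

  inner = b.length      # rows of b, must equal columns of a
  bcols = b[0].length   # columns of the result
  raise IndexError, "invalid other row" unless b.all? { |r| r.length == bcols }
  raise IndexError, "invalid self row length" unless a.all? { |r| r.length == inner }

  a.map do |arow|
    (0...bcols).map do |j|
      # Start from 0 and use only + and *, as the C version does,
      # so any numeric element type works.
      (0...inner).inject(0) { |sum, k| sum + arow[k] * b[k][j] }
    end
  end
end

p mult_rows([[1, 2], [3, 4]], [[5, 6], [7, 8]])
p mult_rows([[1, 2, 3], [4, 5, 6]], [[7, 8], [9, 10], [11, 12]])
```

The second call uses non-square inputs on purpose, since loop-bound mistakes in the inner dimension only show up when the matrices are not square.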

David G. Andersen wrote:

If we had a superfast matrix class, it would give (a) along
with decreasing the runtime by a factor of ten or so. Sounds
reasonable. The same test in matlab would be about 1/4 the lines of
code and 20x faster if implemented the way it should be.

Great! Then put all C-solutions from the shootout into
an extension and call them from Ruby. This way, Ruby
would have the second place in runtime after C and
the 1st in LOC count.

  This doesn't follow. Ruby _already_ has a matrix convenience
class - my point is simply that by adding 61 additional lines of
code (73 lines of C - the 12 lines of ruby it replaces), we turn
the internal matrix class from something that's twice as slow
as what you would get if you implemented the whole thing in optimized ruby to something that's 5x faster. Being able to use
an already well-implemented matrix class is a very nice bonus;
who wants to roll their own just to speed things up a bit?

What are the criteria for having Ruby include something written in C?

For example, I've read complaints concerning Ruby's speed in processing
large XML files. REXML is pure Ruby, and the speed just can't match a
C-based parser.

So what if REXML, or just parts of it, were re-written in C? Fair game?

Or simply include libxml or expat in the core Ruby distribution, and
include a Ruby binding?

Not advocating, just using XML parsing as an example. Partly though
because, by comparison, Ruby ships with a YAML parser written in C. So,
in principle, I would imagine that a C-based XML parser would be at
least eligible for consideration. In general, though, what are the
criteria for such consideration?

Some issues I can think of for deciding to include C code:

  * License: Code would need to be compatible with/equivalent to
    Ruby's license
  * Flexibility/Access: Pure-Ruby libs are available for metaprogramming;
    I can, in the REXML example, dynamically munge the workings of the
    parser, something that might vanish were parts replaced with C
  * Compilation: Adding more C ups the chance that someone, somewhere,
    will not be able to build Ruby on some platform. Or move it to Rite.
  * Maintenance/Ownership: Does it make sense to ship a library, such as
    expat, that is maintained outside of the Ruby core?
    If code is added to the core, does it make Ruby harder/easier to
    maintain?

James


On Sun, Nov 07, 2004 at 07:48:38AM +0900, Christian Szegedy scribed: > >>David G. Andersen wrote:

David G. Andersen wrote:

> Disabling the GC doesn't help most of the hash-bound benchmarks
> much if they're written properly. The basic loop iteration is
> quite fast if you remove the hash access. You can draw your own
> conclusions from this.
>

Hihi, you are so self-confident. I made some tests, and in fact,
it turned out, that the hash access is not the problem at all:

The original code is:


======================================================
n = (ARGV.shift || 1).to_i

hash = {}
for i in 1..n
  hash['%x' % i] = 1
end

c = 0
n.downto 1 do |i|
  c += 1 if hash.has_key? i.to_s
end

puts c

I let it run:

chris@gentoo:~$ time h.rb 500000
real 0m8.929s
user 0m8.724s
sys 0m0.131s

Now I try the decimal conversion instead of the
hexadecimal:

I changed the line
hash['%x' % i] = 1
to
hash['%d' % i] = 1

chris@gentoo:~$ time h.rb 500000
real 0m9.101s
user 0m8.955s
sys 0m0.106s

And the result is roughly the same.
Let's try the functionally equivalent:
hash[i.to_s] = 1

chris@gentoo:~$ time h.rb 500000
real 0m4.673s
user 0m4.600s
sys 0m0.050s

So, in fact the hash access is about the
same speed as in Perl. The slowdown comes from
the formatted output!

So much for the reliability of the shootout!

Best Regards, Christian

Christian Szegedy wrote:

And the result is roughly the same.
Let's try the functionally equivalent:
hash[i.to_s] = 1

Just a side remark: one can use

i.to_s(16)

and it is about as fast as the unformatted
decimal conversion and it is equivalent to
the original specification.
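Anyone who wants to repeat the comparison can use something like the sketch below. It relies only on the standard `benchmark` library; the loop size is arbitrary and the timings will differ by machine and Ruby version, so no numbers are claimed here.

```ruby
require 'benchmark'

n = 200_000   # arbitrary size chosen for this illustration

# Both conversions produce the same keys for positive integers.
same = (1..1000).all? { |i| ('%x' % i) == i.to_s(16) }
puts "equivalent for 1..1000: #{same}"

# Time hash insertion with three ways of building the key.
Benchmark.bm(12) do |x|
  x.report("'%x' % i")   { h = {}; 1.upto(n) { |i| h['%x' % i] = 1 } }
  x.report("i.to_s(16)") { h = {}; 1.upto(n) { |i| h[i.to_s(16)] = 1 } }
  x.report("i.to_s")     { h = {}; 1.upto(n) { |i| h[i.to_s] = 1 } }
end
```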

In article <418D7482.6070909@neurogami.com>,

David G. Andersen wrote:

What's the criteria for having Ruby include something written in C?

This is a good question.

Some issues I can think of for deciding to include C code:

* License: Code would need to be compatible with/equivalent to
   Ruby's license
* Flexibility/Access: Pure-Ruby libs are available for metaprogramming;
   I can, in the REXML example, dynamically munge the workings of the
   parser, something that might vanish were parts replaced with C
* Compilation: Adding more C ups the chance that someone, somewhere,
   will not be able to build Ruby on some platform. Or move it to Rite.

That's why I like the approach of having a pure-Ruby option. In general,
packages are usually written first in Ruby (for example the Matrix
class), and then later on someone finds the need for a faster implementation and
implements various speed-critical methods in C (as was the case earlier in this
thread, where someone posted some Ruby/C code for matrix multiplication).

The Matrix module could be changed so that an optional C code version is
require'd - if that require fails, then no problem, you've still got the pure
Ruby implementation. It'll be slow but it will work. If the require
of the shared library containing the Ruby/C implementation succeeds,
then you've replaced some speed-critical methods in the class/module with
faster C implementations of those methods.
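A minimal sketch of that optional-require pattern follows. The feature name passed to `require` is a made-up placeholder (so the fallback branch is the one that runs here), and `PureMult` is a hypothetical stand-in for the pure-Ruby code, not the real Matrix class.

```ruby
# Pure-Ruby implementation: always available.
# PureMult is a hypothetical stand-in used only for this sketch.
module PureMult
  def self.mult(a, b)
    cols = b.transpose
    a.map do |row|
      cols.map { |col| row.zip(col).inject(0) { |s, (x, y)| s + x * y } }
    end
  end
end

# Try to load the optional C extension. The feature name is a
# placeholder; a real library would use its own extension name.
begin
  require 'matrix/fastmath_hypothetical'
  MULT_BACKEND = :c
rescue LoadError
  MULT_BACKEND = :ruby   # no problem: slower, but it still works
end

puts "backend: #{MULT_BACKEND}"
p PureMult.mult([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Rescuing only `LoadError` keeps real errors inside the extension visible instead of silently falling back.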

Phil


James Britt <jamesUNDERBARb@neurogami.com> wrote:

James Britt wrote:

What's the criteria for having Ruby include something written in C?

For example, I've read complaints concerning Ruby's speed in processing
large XML files. REXML is pure Ruby, and the speed just can't match a
C-based parser.

So what if REXML, or just parts of it, were re-written in C? Fair game?

Or simply include libxml or expat in the core Ruby distribution, and
include a Ruby binding?

Not advocating, just using XML parsing as an example. Partly though
because, by comparison, Ruby ships with a YAML parser written in C. So,
in principle, I would imagine that a C-based XML parser would be at
least eligible for consideration. In general, though, what are the
criteria for such consideration?

Some issues I can think of for deciding to include C code:

* License: Code would need to be compatible with/equivalent to
   Ruby's license
* Flexibility/Access: Pure-Ruby libs are available for metaprogramming;
   I can, in the REXML example, dynamically munge the workings of the
   parser, something that might vanish were parts replaced with C
* Compilation: Adding more C ups the chance that someone, somewhere,
   will not be able to build Ruby on some platform. Or move it to Rite.
* Maintenance/Ownership: Does it make sense to ship a library, such as
   expat, that is maintained outside of the Ruby core?
   If code is added to the core, does it make Ruby harder/easier to
   maintain?

These are good criteria. Maybe another would be:

   * Application logic remains in ruby code.

The compilation/porting criterion is a bit easier now that MSVC is freely downloadable, but it's still an issue and I am pessimistic enough to think that it probably always will be.

Nevertheless, I do think Ruby with a limited amount of C is fair game when choosing Ruby examples that are intended to show, realistically, whether Ruby is computationally adequate for some task. The Ruby API is a feature that should be considered by anyone shopping for languages.

IMHO, we should especially promote examples involving code generation,
which can succeed on the flexibility and maintenance points. But I'm not sure there are any problems on the shootout site that are suited to code generation. The problems for which code generation is useful tend to be more complex (or complex in a different way) than sorting, hashing, counting, etc., which are better solved with a fixed library.

Here's an example for which code generation is essential. In my work, there are libraries* that allow a ruby program to express, in standard ruby syntax, specifications for a network of hybrid automata. Hybrid automata are essentially state machines with ordinary differential equations in the states and guard expressions on the transitions. The library takes these specs and generates/compiles/loads/runs C code for solving the ODE's and evaluating the guard predicates. Doing this is complicated by dynamic reconfiguration of the network: formulas involve not just variables like 'x' but indirect references to 'obj.x', where obj may change from timestep to timestep depending on discrete transitions. This behavior would make it difficult to use a fixed math lib efficiently--the C code must, for optimal speed, depend on the user's specifications.

Performance is comparable (though a bit slower because of the indirect references) to solvers like Matlab, which can't (when I last checked) even handle dynamic reconfiguration. Yet the programmer using these libraries doesn't even need to know that there is a compiler involved--you just run ruby scripts that include the libraries and define certain structures.

My point isn't that this example would make sense on the "shootout" site, but that writing code generators is a realistic approach with Ruby, because of Ruby's:

   - C API and mkmf.rb

   - string processing

   - ease of working with complex object models

With a little work, you can get the performance of C without losing metaprogramming and other dynamic aspects of ruby code (for example, debugging the hybrid automata using irb). And you are still expressing your application logic entirely in ruby code.
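To give a flavor of the string-processing side, here is a toy sketch that turns a Ruby-level spec into C source text. It is not the Cgen or RedShift API: the spec format, the `generate_derivs` helper, and the generated function signature are all invented for illustration, and the compile/load step (via mkmf) is left out.

```ruby
# Hypothetical spec: each state variable maps to the C expression
# for its time derivative (a harmonic oscillator, as an example).
spec = { "x" => "v", "v" => "-k * x" }

# Build the source of a C function with plain string processing.
# The function and parameter names are invented for this sketch.
def generate_derivs(spec)
  vars = spec.keys
  lines = []
  lines << "void derivs(const double *y, double *dydt, double k) {"
  # Unpack the state vector into named locals.
  vars.each_with_index { |name, i| lines << "  double #{name} = y[#{i}];" }
  # Emit one assignment per derivative expression.
  vars.each_with_index { |name, i| lines << "  dydt[#{i}] = #{spec[name]};" }
  lines << "}"
  lines.join("\n") + "\n"
end

source = generate_derivs(spec)
puts source
```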


--
* Cgen and RedShift, which are themselves pure ruby. Cgen is on RAA, but I haven't released RedShift yet.

Funny you should mention that. On my list of things to do with REXML is to
rewrite some of the critical sections in C and see how it fares. There are a
couple of classes that consume a fair chunk of the parsing time, and that
haven't changed much in the past couple years. The only thing I'm really not
looking forward to are the inevitable problems that come with C code, such as
memory leaks, buffer overflow bugs, and portability issues -- all of which
I'm by now used to not having to worry about since I've been using high level
languages for the past ten years.

Anyway, I've got another XPath rewrite in the works, RelaxNG validation about
halfway done, and a number of bug fixes to do, and we lost a hard drive on
the server last week, so it may be a little while until I get around to it.


On Saturday 06 November 2004 20:03, James Britt wrote:

For example, I've read complaints concerning Ruby's speed in processing
large XML files. REXML is pure Ruby, and the speed just can't match a
C-based parser.

So what if REXML, or just parts of it, were re-written in C? Fair game?

--
### SER
### Deutsch|Esperanto|Francaise|Linux|XML|Java|Ruby|Aikido
### http://www.germane-software.com/~ser jabber.com:ser ICQ:83578737
### GPG: http://www.germane-software.com/~ser/Security/ser_public.gpg

> Disabling the GC doesn't help most of the hash-bound benchmarks
> much if they're written properly. The basic loop iteration is
> quite fast if you remove the hash access. You can draw your own
> conclusions from this.
>

Hihi, you are so self-confident. I made some tests, and in fact,
it turned out, that the hash access is not the problem at all:

The original code is:

Actually, I was referring more to the performance of the
other hash tests -- you'll note in the comments for the example you
cited below that the test author believes the "hash 1" test to
be dominated by sprintf performance, not hash performance.

Consider instead the code

n = Integer(ARGV.shift || 1)

hash1 = {}
for i in 0 .. 9999
    hash1["foo_" << i.to_s] = i
end

hash2 = Hash.new(0)
n.times do
    for k in hash1.keys
        hash2[k] += hash1[k]
    end
end

printf "%d %d %d %d\n",
    hash1["foo_1"], hash1["foo_9999"], hash2["foo_1"], hash2["foo_9999"]

Which is pretty close to your example of:

hash[i.to_s] = 1

but it forces the hashing into a string context so that languages
that don't make a strong distinction between strings and numbers
can't cheat (hi, perl. ;-). Or consider the "word frequency" test -
it's pretty hash dominated. For the hash2 test:

503 eep:~> time ./t-hash2.rb 100
1 9999 100 999900
3.518u 0.031s 0:03.70 95.6% 5+4499k 0+0io 0pf+0w

Without the hash access:

505 eep:~> time ./t-hash2.rb 100
1 9999 0 0
0.635u 0.023s 0:00.66 98.4% 14+3998k 0+0io 0pf+0w

(the loop overhead and initial hash creation don't add too much)

With the hash access changed to only the LHS: hash2[k] += 1

507 eep:~> time ./t-hash2.rb 100
1 9999 100 100
2.793u 0.031s 0:03.35 84.1% 7+4521k 0+0io 0pf+0w

And just accessing the RHS: hash1[k]

510 eep:~> time ./t-hash2.rb 100
1 9999 0 0
1.590u 0.023s 0:01.68 95.8% 8+4233k 0+0io 0pf+0w

it's pretty clear that assigning to a hash is the most expensive
operation. The perl equivalent takes
2.367u 0.031s 0:02.48 96.3% 850+3242k 0+0io 0pf+0w
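The four timings above could be gathered in a single run rather than by editing the script each time. What follows is only a sketch using the standard `benchmark` library. It assumes that `keys.each` is an acceptable stand-in for the original `for` loop, and the variable names `timings`, `hash3`, and the reduced pass count are illustrative, not taken from the original script.

```ruby
require 'benchmark'

# Same setup as the test above: 10,000 string keys mapped to integers.
hash1 = {}
(0..9999).each { |i| hash1["foo_" + i.to_s] = i }
keys = hash1.keys
n = 10 # number of passes (assumption; the thread used 100)

timings = {}

# Loop overhead only, no hash access.
timings["loop only"] = Benchmark.realtime do
  n.times { keys.each { |k| k } }
end

# Read from hash1 only (the right-hand side).
timings["read only"] = Benchmark.realtime do
  n.times { keys.each { |k| hash1[k] } }
end

# Update hash2 only (the left-hand side). Note that += still reads
# hash2[k] before storing the new value.
hash2 = Hash.new(0)
timings["write only"] = Benchmark.realtime do
  n.times { keys.each { |k| hash2[k] += 1 } }
end

# Full operation: read hash1 and accumulate into a fresh hash.
hash3 = Hash.new(0)
timings["read and write"] = Benchmark.realtime do
  n.times { keys.each { |k| hash3[k] += hash1[k] } }
end

timings.each { |label, secs| printf "%-15s %.3fs\n", label, secs }
```

Wall-clock numbers from a run like this will vary by machine and interpreter version, so they are only comparable with each other, not with the figures quoted in this thread.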

The ruby version could be improved a bit by pre-declaring the
loop variables, but this is all contained within that .66 seconds
that wasn't the hash access; even if we eliminated it all,
we're still slower than the perl equivalent. Predeclaring:

520 eep:~> time ./t-hash2.rb 100
1 9999 100 999900
3.349u 0.015s 0:03.49 95.9% 6+4524k 0+0io 0pf+0w
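The predeclared version of the script is not shown in the thread. One plausible reading, offered here as an assumption, is that the loop variables are assigned in the outer scope before the loops run, so the block passed to `times` reuses an existing local instead of creating its own. A minimal sketch under that assumption:

```ruby
n = 2 # number of passes (assumption; the thread used 100)

# Assumption: "pre-declaring" means assigning the loop variables in the
# outer scope before any loop or block uses them.
i = k = nil

hash1 = {}
for i in 0..9999
  hash1["foo_" + i.to_s] = i
end

hash2 = Hash.new(0)
n.times do
  # k already exists outside the block, so the block does not create it.
  for k in hash1.keys
    hash2[k] += hash1[k]
  end
end

printf "%d %d %d %d\n",
       hash1["foo_1"], hash1["foo_9999"], hash2["foo_1"], hash2["foo_9999"]
```

Whether this helps at all depends on the interpreter. The only measurement available here is the one quoted above, which went from 3.518u to 3.349u.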

Now, is it the actual hash implementation that's the bottleneck,
or merely the operations required to access it? That I don't know.
But hash[blah] = blah _does_ seem to be measurably slower than
its equivalent in, say, perl. It's not drastic, but it's 20%.

  -Dave


On Sun, Nov 07, 2004 at 11:18:39AM +0900, Christian Szegedy scribed:
> David G. Andersen wrote:

--
work: dga@lcs.mit.edu me: dga@pobox.com
      MIT Laboratory for Computer Science http://www.angio.net/

Christian Szegedy wrote:

So much about the reliability of the shootout!

Well, there is a reason the ranking system is named CRAPS :slight_smile:
Anyway, the Ruby code was written mostly by non-Rubyists, and IIRC they would like to have suggestions :wink:

Phil Tomson wrote:

The Matrix module could be changed so that an optional C code version is require'd - if that require fails, then no problem, you've still got the pure Ruby implementation. It'll be slow but it will work. If the require of the shared library containing the Ruby/C implementation succeeds, then you've replaced some speed-critical methods in the class/module with faster C implementations of those methods.

IIRC, YAML and Syck work that way.
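The optional-extension pattern described above might look roughly like the following. This is only a sketch: the module name `DemoMatrix`, the extension name `demo_matrix_ext`, and the `BACKEND` constant are all hypothetical and do not refer to the real Matrix library or to any existing gem.

```ruby
# Pure-Ruby implementation, always available (hypothetical module).
module DemoMatrix
  # Multiply two matrices given as arrays of rows.
  def self.multiply(a, b)
    cols = b.transpose
    a.map do |row|
      cols.map do |col|
        row.zip(col).inject(0) { |sum, (x, y)| sum + x * y }
      end
    end
  end
end

begin
  # Hypothetical compiled extension. If it were installed, it would be
  # expected to redefine DemoMatrix.multiply with a C implementation.
  require 'demo_matrix_ext'
  BACKEND = :c
rescue LoadError
  # Extension not available: keep the pure-Ruby methods defined above.
  BACKEND = :ruby
end

p DemoMatrix.multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
puts "backend: #{BACKEND}"
```

Callers use `DemoMatrix.multiply` either way, so the choice of backend stays invisible to application code. Rescuing only `LoadError` means a broken extension that raises some other error is not silently ignored.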

David G. Andersen wrote:

The ruby version could be improved a bit by pre-declaring the
loop variables, but this is all contained within that .66 seconds
that wasn't the hash access; even if we eliminated it all,
we're still slower than the perl equivalent. Predeclaring:

520 eep:~> time ./t-hash2.rb 100
1 9999 100 999900
3.349u 0.015s 0:03.49 95.9% 6+4524k 0+0io 0pf+0w

Now, is it the actual hash implementation that's the bottleneck,
or merely the operations required to access it? That I don't know.
But hash[blah] = blah _does_ seem to be measurably slower than
its equivalent in, say, perl. It's not drastic, but it's 20%.

  -Dave

I tried it on my machine, but my numbers are a bit different:

chris@gentoo:~/work$ time perl h2.pl 200
1 9999 200 1999800

real 0m5.353s
user 0m5.251s
sys 0m0.008s

chris@gentoo:~/work$ time h2.rb 200
1 9999 200 1999800

real 0m4.384s
user 0m4.309s
sys 0m0.026s

So, on my machine Ruby is 20% faster (not that I would care...).

Perl is: v5.8.2 built for i686-linux
Ruby is: ruby 1.8.0 (2003-08-04) [i686-linux-gnu]

I don't use the binary package, since it is a Gentoo-box and
everything is built from source.

Best Regards, Christian