Converting some autogenerated ruby code to C

One of the next things I want to do in my grammar package is to give
it a performance boost by generating C code instead of ruby code.
Right now the parsers I generate are a set of a few methods with giant
expressions in each. I would like to try improving the performance by
generating and compiling C code instead. My preference would be to
convert my small subset of ruby code to C code (easily have an option
to use pure ruby or use ruby/C), but I could also autogenerate C code
instead. Anybody have any opinions/ideas about what approach I should
take? I saw a rubyToC package on rubyforge that I could possibly use,
but I can't seem to find any good documentation/examples. It also
looks like it has some type inference that I don't want (I take
advantage of duck typing quite a bit). rb2c looks to be another
option.

Opinions?

Eric

> One of the next things I want to do in my grammar package is to give
> it a performance boost by generating C code instead of ruby code.
> Right now the parsers I generate are a set of a few methods with giant
> expressions in each. I would like to try improving the performance by
> generating and compiling C code instead. My preference would be to
> convert my small subset of ruby code to C code (easily have an option
> to use pure ruby or use ruby/C), but I could also autogenerate C code
> instead. Anybody have any opinions/ideas about what approach I should
> take? I saw a rubyToC package on rubyforge that I could possibly use,
> but I can't seem to find any good documentation/examples.

Well, we don't have any examples yet. We're still driving features into it from concrete ruby code so haven't focused on documentation because most people aren't using it.

The best example is the zenoptimize demo in ZenHacks.

> It also looks like it has some type inference that I don't want (I take
> advantage of duck typing quite a bit).

It is largely optional, we need it in the ANSI C translator (int, double, char *, etc). In the Ruby C translator (VALUE) it could be fairly easily removed.

···

On Jul 27, 2006, at 2:25 PM, Eric Mahurin wrote:

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

Eric Mahurin wrote:

One of the next things I want to do in my grammar package is to give
it a performance boost by generating C code instead of ruby code.

Does this mean that new versions of Grammar will be released in the
future and that work continues on it?

Phil

Thanks Eric. This pointed me in the right direction.

I started with the factorial example. This was the reference method I used:

  def factorial(n)
    f=1
    n.downto(2) { |x| f *= x }
    f
  end

Here were the performance results on my machine for a million factorial(20):

ruby : 19.2s
rubyToAnsiC : 0.7s
* assumed argument was a Fixnum
* converted downto iterator to a while loop
* didn't handle overflow properly and use Bignum - gave wrong answer
rubyToRubyC : 20.9s
* converted downto iterator to a while loop
* 3 messages sent (rb_funcall) per iteration of the inner loop
hand-optimized rubyC : 14.8s
* precalculated interns/symbols for method calls
* reordered operands so that a constant operand was the receiver if possible
* used rb_funcall2 instead of rb_funcall
downto iterator rubyC : 15.3s
* more direct translation of original ruby
* used downto iterator and block (needed a couple extra helper functions)
* precalculated interns/symbols for method calls
* one block call and one message sent per iteration of the inner loop

For my purposes, rubyToAnsiC is pretty much useless since I need duck
typing on most of my arguments (can't infer the type). I also found
that rubyToC wasn't very robust - simply changing n.downto(2) to
n.step(2,-1) broke it. I think a more direct translation would be
better. I don't see a lot of benefit from the other solutions, so
I'll pursue other avenues for optimization or just wait for YARV.
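
For anyone who wants to reproduce the comparison, here is a minimal sketch in plain Ruby. It puts the reference method next to a while-loop form like the one the translators produced, and shows what truncating to 32 bits does to factorial(20). The names factorial_while and wrap32 are made up for illustration, and the 32-bit width is only an assumption about the C long involved, not something measured in this thread.

```ruby
require "benchmark"

# Reference method from the post above.
def factorial(n)
  f = 1
  n.downto(2) { |x| f *= x }
  f
end

# Hypothetical while-loop equivalent (the shape a translator might emit).
def factorial_while(n)
  f = 1
  x = n
  while x >= 2
    f *= x
    x -= 1
  end
  f
end

# Assumption: simulate a signed 32-bit C long by wrapping the value.
def wrap32(value)
  ((value + 2**31) % 2**32) - 2**31
end

if __FILE__ == $0
  n = 100_000 # the post used a million calls; reduced to keep the run short
  Benchmark.bm(8) do |bm|
    bm.report("downto:") { n.times { factorial(20) } }
    bm.report("while:")  { n.times { factorial_while(20) } }
  end
end
```

factorial(20) is larger than 2**32, so wrap32(factorial(20)) cannot equal the true result. That is consistent with the "gave wrong answer" note for rubyToAnsiC, under the 32-bit assumption.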

···

On 7/28/06, Eric Hodel <drbrain@segment7.net> wrote:

> It also looks like it has some type inference that I don't want (I
> take
> advantage of duck typing quite a bit).

It is largely optional, we need it in the ANSI C translator (int,
double, char *, etc). In the Ruby C translator (VALUE) it could be
fairly easily removed.

Yep. I haven't done a checkin to CVS in a while though. I've been fairly
actively working on it in the last few months. One of the latest interesting
things I've figured out and put in is left recursion. Normally this is
unheard of with LL parsers. This opens up many of the LR grammars (e.g.
from YACC) to be used with my parser (but I still need an (E)BNF converter to
make it easier).
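
The thread doesn't say how Grammar handles left recursion, so the following is only a sketch of one common technique and not Grammar's implementation: a left-recursive rule is rewritten as a loop that still builds the result from the left. The rule and the parse_expr name are invented for the example.

```ruby
# Left-recursive rule:     expr := expr "-" num | num
# Iterative equivalent:    expr := num ("-" num)*
# A naive recursive-descent method for the first form would call itself
# before consuming any token; the loop below avoids that while keeping
# left associativity.
def parse_expr(tokens)
  value = Integer(tokens.shift)
  while tokens.first == "-"
    tokens.shift                   # consume "-"
    value -= Integer(tokens.shift) # fold the next operand in from the left
  end
  value
end
```

With the token list %w[8 - 3 - 2] this evaluates as (8 - 3) - 2, which is the grouping a left-recursive grammar expresses.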

Hopefully once Dominik implements some kind of *eval replacement, I'll be
able to use ruby2cext and get the performance of Regexp. With a hand-coded
character-by-character lexer (the kind Grammar generates), I see a 4-5X
performance improvement (using his CVS ruby2cext), which beats the fastest
equivalent Regexp/StringScanner lexer (which itself sees only a small
improvement using ruby2cext). This could make Grammar a Regexp alternative -
equivalent performance and much, much more flexible.
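
To make the two lexer styles concrete, here is a small sketch of both for a toy token set (integers, single-character operators, spaces skipped). It is not code generated by Grammar; the method names and the token set are assumptions chosen for the example.

```ruby
require "strscan"

# Hand-coded lexer: walks the string one character at a time.
def lex_by_char(src)
  tokens = []
  i = 0
  while i < src.length
    c = src[i]
    if c == " "
      i += 1
    elsif c >= "0" && c <= "9"
      start = i
      i += 1 while i < src.length && src[i] >= "0" && src[i] <= "9"
      tokens << src[start...i]
    else
      tokens << c
      i += 1
    end
  end
  tokens
end

# Regexp/StringScanner version of the same lexer.
def lex_by_scanner(src)
  ss = StringScanner.new(src)
  tokens = []
  until ss.eos?
    next if ss.skip(/ +/)
    tokens << (ss.scan(/\d+/) || ss.getch)
  end
  tokens
end
```

Both return the same token list for the same input, so they can be timed against each other with Benchmark. No timings are claimed here.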

I'll try to make a checkin soon so that you can play around with it.

···

On 8/13/06, Phil Tomson <philtomson@gmail.com> wrote:

Eric Mahurin wrote:
> One of the next things I want to do in my grammar package is to give
> it a performance boost by generating C code instead of ruby code.

Does this mean that new versions of Grammar will be released in the
future and that work continues on it?

Phil

> It also looks like it has some type inference that I don't want (I
> take
> advantage of duck typing quite a bit).

It is largely optional, we need it in the ANSI C translator (int,
double, char *, etc). In the Ruby C translator (VALUE) it could be
fairly easily removed.

> Thanks Eric. This pointed me in the right direction.
>
> I started with the factorial example. This was the reference method I used:
>
> def factorial(n)
>    f=1
>    n.downto(2) { |x| f *= x }
>    f
> end
>
> Here were the performance results on my machine for a million factorial(20):
>
> ruby : 19.2s
> rubyToAnsiC : 0.7s
> * assumed argument was a Fixnum
> * converted downto iterator to a while loop
> * didn't handle overflow properly and use Bignum - gave wrong answer
> rubyToRubyC : 20.9s

This is how we've defined our subset. C doesn't have bignums or blocks.

> * 3 messages sent (rb_funcall) per iteration of the inner loop
>
> hand-optimized rubyC : 14.8s
> * precalculated interns/symbols for method calls
> * reordered operands so that a constant operand was the receiver if possible
> * used rb_funcall2 instead of rb_funcall
>
> downto iterator rubyC : 15.3s
> * more direct translation of original ruby
> * used downto iterator and block (needed a couple extra helper functions)
> * precalculated interns/symbols for method calls
> * one block call and one message sent per iteration of the inner loop
>
> For my purposes, rubyToAnsiC is pretty much useless since I need duck
> typing on most of my arguments (can't infer the type). I also found
> that rubyToC wasn't very robust - simply changing n.downto(2) to
> n.step(2,-1) broke it. I think a more direct translation would be
> better. I don't see a lot of benefit from the other solutions, so
> I'll pursue other avenues for optimization or just wait for YARV.

We've been working towards this for Ruby2RubyC, but we aren't there yet. The speedup created by doing automatic translation to the C API isn't that big, and sometimes can be a slowdown. For Ruby2RubyC I've typically broken even using the current API. As we drive more features into it from ZenObfuscate we might gain some speed, but it isn't ready to be used for optimization yet.

···

On Jul 30, 2006, at 8:02 PM, Eric Mahurin wrote:

On 7/28/06, Eric Hodel <drbrain@segment7.net> wrote:

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

Hi Eric,

> ruby : 19.2s
> rubyToAnsiC : 0.7s
> * assumed argument was a Fixnum
> * converted downto iterator to a while loop
> * didn't handle overflow properly and use Bignum - gave wrong answer
> rubyToRubyC : 20.9s
> * converted downto iterator to a while loop
> * 3 messages sent (rb_funcall) per iteration of the inner loop
> hand-optimized rubyC : 14.8s
> * precalculated interns/symbols for method calls
> * reordered operands so that a constant operand was the receiver if possible
> * used rb_funcall2 instead of rb_funcall
> downto iterator rubyC : 15.3s
> * more direct translation of original ruby
> * used downto iterator and block (needed a couple extra helper functions)
> * precalculated interns/symbols for method calls
> * one block call and one message sent per iteration of the inner loop
>
> For my purposes, rubyToAnsiC is pretty much useless since I need duck
> typing on most of my arguments (can't infer the type). I also found
> that rubyToC wasn't very robust - simply changing n.downto(2) to
> n.step(2,-1) broke it. I think a more direct translation would be
> better. I don't see a lot of benefit from the other solutions, so
> I'll pursue other avenues for optimization or just wait for YARV.

You could try Ruby2CExtension (http://ruby2cext.rubyforge.org/). It is similar to rubyToRubyC but supports blocks and closures and tries to match Ruby's semantics as closely as possible. It can handle most Ruby code.

You should first try the 0.1.0 release and see if it works with your Ruby code (also read the limitations document). This version doesn't have any real optimizations, so the compiled code will probably only be a bit faster than the pure Ruby version.

Once that works you can try the HEAD revision of svn://rubyforge.org/var/svn/ruby2cext/trunk. This new version has constant lookup caching and optimizes calls to most public methods (including operators) of builtin classes (Array, Bignum, FalseClass, Fixnum, Float, Hash, NilClass, Regexp, String, Symbol, TrueClass) that don't do anything with blocks. This can produce significant speedups depending on your code.

I am currently thinking about doing optimizations for common iterators like Array/Range#each, map, ... and Fixnum#times, upto, ... which should yield further speedups.

Dominik

···

On Mon, 31 Jul 2006 05:02:08 +0200, Eric Mahurin <eric.mahurin@gmail.com> wrote:

Hello,
I installed RubyToC (gem install RubyToC) but I can't find a
translate.rb.

How can I translate Ruby code to ANSI C?

regards


Thanks Dominik,

What you have sounds quite robust and able to handle a wide variety of
ruby functionality. Unfortunately, it still doesn't help much on the
performance front :( 1-1.5X just doesn't seem like enough to justify
the effort. The main problem looks to be rb_funcall*. I looked
through eval.c a little and it looks to have a large amount of
overhead before you get to the final C call (if it is a C-based
method). I'm wondering how much optimization has been done there. I
also mentioned on ruby-core that it would be quite useful to be able
to separate the method lookup from the method call in C (and ruby) to
gain additional performance when you are dealing with a variable of a
constant class. I'm sure you'd agree with the ruby2C stuff you are
doing.
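
At the Ruby level, the idea of separating lookup from call can be sketched with Object#method, which resolves a method once and returns a Method object that can be called repeatedly. This only illustrates the concept; it is not the C-level API being asked for, and no speedup is claimed for it.

```ruby
# Look the method up once...
text = "hello"
index_of = text.method(:index) # Method object bound to this receiver

# ...then call it repeatedly without naming the method at the call site.
positions = %w[h l o z].map { |ch| index_of.call(ch) }
```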

Out of curiosity, could this be used with meta-classes? I do
something like this to get self-defining methods:

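        def test(a)
            # Redefine #test on this object's singleton class, using the
            # Ruby source that #tester generates.
            (class << self;self;end).class_eval( "
                def test(a)
                    #{tester("a")}
                end
            " )
            # Call the freshly defined version, passing the argument along.
            test(a)
        end
        def tester(a)
            # Returns Ruby source as a string; a is the name of a variable.
            %Q{puts("hello " + #{a})}
        end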

If I were to go down the C optimization route, I would want the
ability to convert the generated ruby (from #tester) to C and
compile/load it for this object. This is the only place I really want
a lot of optimization. I'm already doing quite a bit in my ruby code
generation and have more to go.

If possible and you don't do it already, you might want to think about
ruby2cext replacement methods (different names) for all of the eval
methods that work on strings/files: eval, instance_eval, class_eval,
require, etc. When a compiler isn't available on the system or
something goes wrong, you could revert to the original eval methods.
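
A rough sketch of what such a replacement with fallback could look like. Everything here is hypothetical: compiled_class_eval is an invented name, and the compiler argument is just a callable standing in for whatever a ruby2cext-style tool might offer. It is not an existing ruby2cext API.

```ruby
# Hypothetical wrapper: try a compiling backend first, and fall back to
# the normal string class_eval if none is given or it fails.
# compiler is any callable taking (target, code).
def compiled_class_eval(target, code, compiler = nil)
  if compiler
    begin
      compiler.call(target, code)
      return :compiled
    rescue StandardError, ScriptError
      # No C compiler, build error, unsupported construct, and so on:
      # fall through to the interpreted path below.
    end
  end
  target.class_eval(code)
  :interpreted
end
```

The return value only reports which path was taken, so calling code can log it or ignore it.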

···

On 7/31/06, Dominik Bathon <dbatml@gmx.de> wrote:

Hi Eric,

On Mon, 31 Jul 2006 05:02:08 +0200, Eric Mahurin <eric.mahurin@gmail.com> wrote:

> ruby : 19.2s
> rubyToAnsiC : 0.7s
> * assumed argument was a Fixnum
> * converted downto iterator to a while loop
> * didn't handle overflow properly and use Bignum - gave wrong answer
> rubyToRubyC : 20.9s
> * converted downto iterator to a while loop
> * 3 messages sent (rb_funcall) per iteration of the inner loop
> hand-optimized rubyC : 14.8s
> * precalculated interns/symbols for method calls
> * reordered operands so that a constant operand was the receiver if
> possible
> * used rb_funcall2 instead of rb_funcall
> downto iterator rubyC : 15.3s
> * more direct translation of original ruby
> * used downto iterator and block (needed a couple extra helper
> functions)
> * precalculated interns/symbols for method calls
> * one block call and one message sent per iteration of the inner loop
>
> For my purposes, rubyToAnsiC is pretty much useless since I need duck
> typing on most of my arguments (can't infer the type). I also found
> that rubyToC wasn't very robust - simply changing n.downto(2) to
> n.step(2,-1) broke it. I think a more direct translation would be
> better. I don't see a lot of benefit from the other solutions, so
> I'll pursue other avenues for optimization or just wait for YARV.

You could try Ruby2CExtension (http://ruby2cext.rubyforge.org/), it is
similar to rubyToRubyC but supports blocks and closures and tries to match
Ruby's semantics as close as possible. It can handle most Ruby code.

You should first try the 0.1.0 release and see if it works with your Ruby
code (also read the limitations document). This version doesn't have any
real optimizations, so the compiled code will probably only be a bit
faster than the pure Ruby version.

Once that works you can try the HEAD revision of
svn://rubyforge.org/var/svn/ruby2cext/trunk. This new version has constant
lookup caching and optimizes calls to most public methods (including
operators) of builtin classes (Array, Bignum, FalseClass, Fixnum, Float,
Hash, NilClass, Regexp, String, Symbol, TrueClass) that don't do anything
with blocks. This can produce significant speedups depending on your code.

I am currently thinking about doing optimizations for common iterators
like Array/Range#each, map, ... and Fixnum#times, upto, ... which should
yield further speedups.

Dominik

I tried it, it doesn't work.

$ svn info . | egrep 'Revision|URL'
URL: svn://rubyforge.org/var/svn/ruby2cext/trunk
Revision: 10
$ svn info ../../rubynode/trunk/ | egrep 'Revision|URL'
URL: svn://rubyforge.org/var/svn/rubynode/trunk
Revision: 4
$ ruby stuff/builtin_methods_test.rb
433
cc -dynamic -bundle -undefined suppress -flat_namespace -fno-common -O -pipe -fno-common -g -I. -I /usr/local/lib/ruby/1.8/powerpc-darwin8.7.0 -o bm_test.bundle bm_test.c -lruby185-static -ldl -lobjc
$ ruby -rbm_test -e0
./bm_test.bundle: [BUG] Bus Error
ruby 1.8.5 (2006-07-25) [powerpc-darwin8.7.0]

Abort trap
$ gdb `which ruby`
GNU gdb 6.3.50-20050815 (Apple version gdb-477) (Sun Apr 30 20:06:22 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared libraries ... done

(gdb) run -r bm_test -e0
Starting program: /usr/local/bin/ruby -r bm_test -e0
Reading symbols for shared libraries .. done
Reading symbols for shared libraries . done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
st_lookup (table=0x0, key=6005024, value=0xbfffe918) at st.c:250
250 hash_val = do_hash(key, table);
(gdb) bt
#0 st_lookup (table=0x0, key=6005024, value=0xbfffe938) at st.c:250
#1 0x00566c04 in rb_intern (name=0x5ba120 "abs") at parse.y:6006
#2 0x005061c4 in init_syms () at bm_test.c:9
#3 0x0051b584 in Init_bm_test () at bm_test.c:7131
#4 0x0001fe0c in dln_load (file=0x420a50 "./bm_test.bundle") at dln.c:1480
#5 0x0001c390 in rb_require_safe (fname=3299340, safe=0) at eval.c:7165
#6 0x000030e0 in rb_protect (proc=0x1c4e0 <rb_require>, data=4326896, state=0xbffff208) at eval.c:5427
#7 0x000666e4 in require_libraries () at ruby.c:356
#8 0x00067d24 in proc_options (argc=0, argv=0x5ba120) at ruby.c:812
#9 0x00067ef4 in ruby_process_options (argc=3, argv=0xbffff7a8) at ruby.c:1199
#10 0x00015b44 in ruby_options (argc=3, argv=0xbffff7a8) at eval.c:1523
#11 0x00002358 in main (argc=3, argv=0xbffff7a8, envp=0xbfffe938) at main.c:45

···

On Jul 31, 2006, at 12:54 PM, Dominik Bathon wrote:

You could try Ruby2CExtension (http://ruby2cext.rubyforge.org/), it is similar to rubyToRubyC but supports blocks and closures and tries to match Ruby's semantics as close as possible. It can handle most Ruby code.

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

You should first try the 0.1.0 release and see if it works with your Ruby
code (also read the limitations document). This version doesn't have any
real optimizations, so the compiled code will probably only be a bit
faster than the pure Ruby version.

Once that works you can try the HEAD revision of
svn://rubyforge.org/var/svn/ruby2cext/trunk. This new version has constant
lookup caching and optimizes calls to most public methods (including
operators) of builtin classes (Array, Bignum, FalseClass, Fixnum, Float,
Hash, NilClass, Regexp, String, Symbol, TrueClass) that don't do anything
with blocks. This can produce significant speedups depending on your code.

I am currently thinking about doing optimizations for common iterators
like Array/Range#each, map, ... and Fixnum#times, upto, ... which should
yield further speedups.

Dominik

> Thanks Dominik,
>
> What you have sounds quite robust and able to handle a wide variety of
> ruby functionality. Unfortunately, it still doesn't help much on the
> performance front :( 1-1.5X just doesn't seem like enough to justify
> the effort. The main problem looks to be rb_funcall*.

Yes, rb_funcall seems to be slow. That's exactly what the public builtin methods optimization tries to address, it replaces the rb_funcall with (almost) only two C calls. Have you tried the svn version?

> I looked
> through eval.c a little and it looks to have a large amount of
> overhead before you get to the final C call (if it is a C-based
> method). I'm wondering how much optimization has been done there. I
> also mentioned on ruby-core that it would be quite useful to be able
> to separate the method lookup from the method call in C (and ruby) to
> gain additional performance when you are dealing with a variable of a
> constant class. I'm sure you'd agree with the ruby2C stuff you are
> doing.
>
> Out of curiosity, could this be used with meta-classes? I do
> something like this to get self-defining methods:
>
>         def test(a)
>             (class << self;self;end).class_eval( "
>                 def test(a)
>                     #{tester("a")}
>                 end
>             " )
>             test(a)
>         end
>         def tester(a)
>             %Q{puts("hello " + #{a})}
>         end
>
> If I were to go down the C optimization route, I would want the
> ability to convert the generated ruby (from #tester) to C and
> compile/load it for this object. This is the only place I really want
> a lot of optimization. I'm already doing quite a bit in my ruby code
> generation and have more to go.
>
> If possible and you don't do it already, you might want to think about
> ruby2cext replacement methods (different names) for all of the eval
> methods that work on strings/files: eval, instance_eval, class_eval,
> require, etc. When a compiler isn't available on the system or
> something goes wrong, you could revert to the original eval methods.

I think it is hard to do this transparently and efficiently at the same time. You would either have to generate and compile and load a C extension for _each_ call of *eval or somehow collect all the code to eval and do it all at once (but then it probably can't be done transparently).

Maybe you can generate code like:

module TestDefiner
   def self.define_tests(instances)
     (class << instances[0];self;end).class_eval {
       def test(a)
         puts("hello" + a)
       end
       def test2(...)
         ...
       end
       ...
     }
     (class << instances[1];self;end).class_eval {
       def test3(...)
         ...
       end
       def test4(...)
         ...
       end
       ...
     }
     ...
   end
end

Then compile that at once to an extension, require it and call (from Ruby code): TestDefiner.define_tests([instance1, instance2, ...])

This would require some changes in your code, but I think that would be the way to go once you have decided that compiling to C is worth it.
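
For reference, a filled-in and runnable instance of that pattern in plain Ruby, with no compilation step. The method bodies and the two sample objects are invented for illustration; the point is only that the block form of class_eval on a singleton class defines per-object methods without any string eval.

```ruby
module TestDefiner
  # Define per-object methods on each instance's singleton class.
  def self.define_tests(instances)
    (class << instances[0]; self; end).class_eval {
      def test(a)
        "hello " + a
      end
    }
    (class << instances[1]; self; end).class_eval {
      def test(a)
        "goodbye " + a
      end
    }
  end
end

first  = Object.new
second = Object.new
TestDefiner.define_tests([first, second])
```

After the call, first.test and second.test are different methods even though both objects are plain Object instances.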

Dominik

···

On Mon, 31 Jul 2006 23:24:08 +0200, Eric Mahurin <eric.mahurin@gmail.com> wrote:

On 7/31/06, Dominik Bathon <dbatml@gmx.de> wrote:

You could try Ruby2CExtension (http://ruby2cext.rubyforge.org/), it is similar to rubyToRubyC but supports blocks and closures and tries to match Ruby's semantics as close as possible. It can handle most Ruby code.

> I tried it, it doesn't work.

Thanks for the report.

> $ svn info . | egrep 'Revision|URL'
> URL: svn://rubyforge.org/var/svn/ruby2cext/trunk
> Revision: 10
> $ svn info ../../rubynode/trunk/ | egrep 'Revision|URL'
> URL: svn://rubyforge.org/var/svn/rubynode/trunk
> Revision: 4
> $ ruby stuff/builtin_methods_test.rb
> 433
> cc -dynamic -bundle -undefined suppress -flat_namespace -fno-common -O -pipe -fno-common -g -I. -I /usr/local/lib/ruby/1.8/powerpc-darwin8.7.0 -o bm_test.bundle bm_test.c -lruby185-static -ldl -lobjc
> $ ruby -rbm_test -e0
> ./bm_test.bundle: [BUG] Bus Error
> ruby 1.8.5 (2006-07-25) [powerpc-darwin8.7.0]

This might be the problem, it is explicitly only for 1.8.4 (the official release). There seem to have been some changes with rbconfig so the build command (the line below 433) might be wrong.

Future versions will support 1.8.5.

> Abort trap
> $ gdb `which ruby`
> GNU gdb 6.3.50-20050815 (Apple version gdb-477) (Sun Apr 30 20:06:22 GMT 2006)
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared libraries ... done
>
> (gdb) run -r bm_test -e0
> Starting program: /usr/local/bin/ruby -r bm_test -e0
> Reading symbols for shared libraries .. done
> Reading symbols for shared libraries . done
>
> Program received signal EXC_BAD_ACCESS, Could not access memory.
> Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
> st_lookup (table=0x0, key=6005024, value=0xbfffe918) at st.c:250
> 250 hash_val = do_hash(key, table);
> (gdb) bt
> #0 st_lookup (table=0x0, key=6005024, value=0xbfffe938) at st.c:250
> #1 0x00566c04 in rb_intern (name=0x5ba120 "abs") at parse.y:6006

This line is

     if (st_lookup(sym_tbl, (st_data_t)name, (st_data_t *)&id))

So sym_tbl seems to be NULL, which definitely shouldn't be the case, so as I said, I think it's a build problem.

Could you please try to build bm_test.c with the following extconf.rb (with ruby 1.8.5 (2006-07-25) [powerpc-darwin8.7.0]):

require "mkmf"
create_makefile "bm_test"

And please also post the output of make.

And could you also try with 1.8.4 official release?

Thanks,
Dominik

···

On Tue, 01 Aug 2006 01:19:57 +0200, Eric Hodel <drbrain@segment7.net> wrote:

On Jul 31, 2006, at 12:54 PM, Dominik Bathon wrote:

#2 0x005061c4 in init_syms () at bm_test.c:9
#3 0x0051b584 in Init_bm_test () at bm_test.c:7131
#4 0x0001fe0c in dln_load (file=0x420a50 "./bm_test.bundle") at dln.c:1480
#5 0x0001c390 in rb_require_safe (fname=3299340, safe=0) at eval.c:7165
#6 0x000030e0 in rb_protect (proc=0x1c4e0 <rb_require>, data=4326896, state=0xbffff208) at eval.c:5427
#7 0x000666e4 in require_libraries () at ruby.c:356
#8 0x00067d24 in proc_options (argc=0, argv=0x5ba120) at ruby.c:812
#9 0x00067ef4 in ruby_process_options (argc=3, argv=0xbffff7a8) at ruby.c:1199
#10 0x00015b44 in ruby_options (argc=3, argv=0xbffff7a8) at eval.c:1523
#11 0x00002358 in main (argc=3, argv=0xbffff7a8, envp=0xbfffe938) at main.c:45

> > What you have sounds quite robust and able to handle a wide variety of
> > ruby functionality. Unfortunately, it still doesn't help much on the
> > performance front :( 1-1.5X just doesn't seem like enough to justify
> > the effort. The main problem looks to be rb_funcall*.
>
> Yes, rb_funcall seems to be slow. That's exactly what the public builtin
> methods optimization tries to address, it replaces the rb_funcall with
> (almost) only two C calls. Have you tried the svn version?

Cool. I have not tried any of your stuff yet. I'm not at my ruby
development machine right now.

> > Out of curiosity, could this be used with meta-classes? I do
> > something like this to get self-defining methods:
> >
> >         def test(a)
> >             (class << self;self;end).class_eval( "
> >                 def test(a)
> >                     #{tester("a")}
> >                 end
> >             " )
> >             test(a)
> >         end
> >         def tester(a)
> >             %Q{puts("hello " + #{a})}
> >         end
> >
> > If I were to go down the C optimization route, I would want the
> > ability to convert the generated ruby (from #tester) to C and
> > compile/load it for this object. This is the only place I really want
> > a lot of optimization. I'm already doing quite a bit in my ruby code
> > generation and have more to go.
> >
> > If possible and you don't do it already, you might want to think about
> > ruby2cext replacement methods (different names) for all of the eval
> > methods that work on strings/files: eval, instance_eval, class_eval,
> > require, etc. When a compiler isn't available on the system or
> > something goes wrong, you could revert to the original eval methods.
>
> I think it is hard to do this transparently and efficiently at the same
> time. You would either have to generate and compile and load a C extension
> for _each_ call of *eval or somehow collect all the code to eval and do it
> all at once (but then it probably can't be done transparently).
>
> Maybe you can generate code like:
>
> module TestDefiner
>    def self.define_tests(instances)
>      (class << instances[0];self;end).class_eval {
>        def test(a)
>          puts("hello" + a)
>        end
>        def test2(...)
>          ...
>        end
>        ...
>      }
>      (class << instances[1];self;end).class_eval {
>        def test3(...)
>          ...
>        end
>        def test4(...)
>          ...
>        end
>        ...
>      }
>      ...
>    end
> end
>
> Then compile that at once to an extension, require it and call (from Ruby
> code): TestDefiner.define_tests([instance1, instance2, ...])
>
> This would require some changes in your code, but I think that would be
> the way to go once you have decided that compiling to C is worth it.

Unfortunately this won't work. In the above example, I just threw
something arbitrary together. I have something where the string that
#tester generates isn't determined until runtime and is dependent on
the object hierarchy. In my code the methods are named scan/scanner
instead of test/tester. I use this mechanism for making
macros/flattening methods. My parsers (the #scan method) end up being
one giant expression (if there is no recursion). I'm flattening to
get rid of as much of the method calling overhead as I can.

It still should be possible to load a C extension into a meta-class,
right? Wouldn't the Init_* function just define the method in the
meta-class instead of a plain class? Maybe you'd use a temp global
variable to pass this meta-class. I realize caching these so that you
don't have to recompile could be an issue. You could use md5 or
whatever (even string length) to get a mostly unique identifier for
the c/so (like rubyinline).
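
The caching idea can be sketched like this: hash the generated Ruby source and use the digest in the file name of the compiled artifact, so identical source is compiled only once. The helper names, the "scan_" prefix and the "so" default are assumptions for the example; the thread does not say how RubyInline or any other tool actually names its files.

```ruby
require "digest/md5"
require "tmpdir"

# Hypothetical helper: derive a cache file name from the generated Ruby
# source, so identical source maps to the same compiled artifact.
def extension_cache_path(ruby_source, cache_dir = Dir.tmpdir, ext = "so")
  key = Digest::MD5.hexdigest(ruby_source)
  File.join(cache_dir, "scan_#{key}.#{ext}")
end

# True when an artifact for exactly this source already exists.
def cached_extension?(ruby_source, cache_dir = Dir.tmpdir)
  File.exist?(extension_cache_path(ruby_source, cache_dir))
end
```

A caller would check cached_extension? first and only invoke the compiler on a miss.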

···

On 7/31/06, Dominik Bathon <dbatml@gmx.de> wrote:

$ gdb `which ruby184`
GNU gdb 6.3.50-20050815 (Apple version gdb-477) (Sun Apr 30 20:06:22 GMT 2006)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "powerpc-apple-darwin"...Reading symbols for shared libraries .... done

(gdb) run -r bm_test -e0
Starting program: /usr/local/bin/ruby184 -r bm_test -e0
Reading symbols for shared libraries .. done
Reading symbols for shared libraries . done

Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x00000000
st_lookup (table=0x0, key=5874688, value=0xbfffe928) at st.c:245
245 st.c: No such file or directory.
         in st.c
(gdb) bt
#0 st_lookup (table=0x0, key=5874688, value=0xbfffe928) at st.c:245
#1 0x00533294 in rb_intern (name=0x59a400 "abs") at parse.y:5996
#2 0x00505f80 in Init_bm_test () at bm_test.c:9
#3 0x0001e498 in dln_load (file=0x420580 "./bm_test.bundle") at dln.c:1482
#4 0x00019c40 in rb_require_safe (fname=1898796, safe=0) at eval.c:7044
#5 0x00003480 in rb_protect (proc=0x19d4c <rb_require>, data=4325664, state=0xbffff1e8) at eval.c:5335
#6 0x0005fc60 in require_libraries () at ruby.c:356
#7 0x0006115c in proc_options (argc=0, argv=0x59a400) at ruby.c:812
#8 0x0006139c in ruby_process_options (argc=4, argv=0xbffff790) at ruby.c:1192
#9 0x0001ae04 in ruby_options (argc=4, argv=0xbffff790) at eval.c:1460
#10 0x000027ec in main (argc=4, argv=0xbffff790, envp=0xbfffe928) at main.c:45
(gdb)
$ ruby184 -v
ruby 1.8.4 (2005-12-24) [powerpc-darwin8.4.0]

But if I create extconf.rb it seems to work:

$ ruby184 extconf.rb
creating Makefile
$ make
gcc -I. -I/usr/local/lib/ruby/1.8/powerpc-darwin8.4.0 -I/usr/local/lib/ruby/1.8/powerpc-darwin8.4.0 -I. -fno-common -g -O2 -pipe -fno-common -c bm_test.c
cc -dynamic -bundle -undefined suppress -flat_namespace -L"/usr/local/lib" -o bm_test.bundle bm_test.o -ldl -lobjc
$ ruby -r bm_test -e0
$ echo $?
0
$

Same for 1.8.5.

rb2cx exhibits the same problems as not using mkmf.

···

On Jul 31, 2006, at 5:19 PM, Dominik Bathon wrote:

On Tue, 01 Aug 2006 01:19:57 +0200, Eric Hodel <drbrain@segment7.net> wrote:

On Jul 31, 2006, at 12:54 PM, Dominik Bathon wrote:

You could try Ruby2CExtension (http://ruby2cext.rubyforge.org/), it is similar to rubyToRubyC but supports blocks and closures and tries to match Ruby's semantics as close as possible. It can handle most Ruby code.

I tried it, it doesn't work.

Thanks for the report.

$ svn info . | egrep 'Revision|URL'
URL: svn://rubyforge.org/var/svn/ruby2cext/trunk
Revision: 10
$ svn info ../../rubynode/trunk/ | egrep 'Revision|URL'
URL: svn://rubyforge.org/var/svn/rubynode/trunk
Revision: 4
$ ruby stuff/builtin_methods_test.rb
433
cc -dynamic -bundle -undefined suppress -flat_namespace -fno-common -O -pipe -fno-common -g -I. -I /usr/local/lib/ruby/1.8/powerpc-darwin8.7.0 -o bm_test.bundle bm_test.c -lruby185-static -ldl -lobjc
$ ruby -rbm_test -e0
./bm_test.bundle: [BUG] Bus Error
ruby 1.8.5 (2006-07-25) [powerpc-darwin8.7.0]

This might be the problem, it is explicitly only for 1.8.4 (the official release). There seem to have been some changes with rbconfig so the build command (the line below 433) might be wrong.

Future versions will support 1.8.5.

#0 st_lookup (table=0x0, key=6005024, value=0xbfffe938) at st.c:250
#1 0x00566c04 in rb_intern (name=0x5ba120 "abs") at parse.y:6006

This line is

    if (st_lookup(sym_tbl, (st_data_t)name, (st_data_t *)&id))

So sym_tbl seems to be NULL, which definitely shouldn't be the case, so as I said, I think it's a build problem.

Could you please try to build bm_test.c with the following extconf.rb (with ruby 1.8.5 (2006-07-25) [powerpc-darwin8.7.0]):

require "mkmf"
create_makefile "bm_test"

And please also post the output of make.

And could you also try with 1.8.4 official release?

--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
This implementation is HODEL-HASH-9600 compliant

http://trackmap.robotcoop.com

OK, I have investigated a bit more and it seems that the libs I added to the build command are unnecessary (and can even cause problems), so I removed them. Could you please test the latest version?

Concerning 1.8.4 vs. 1.8.5: builtin_methods_test.rb should now work with latest 1.8.5, but rb2cx generally will not, because there have been some changes to nodes, e.g.

http://www.ruby-lang.org/cgi-bin/cvsweb.cgi//ruby/eval.c.diff?r1=1.616.2.143;r2=1.616.2.144;only_with_tag=ruby_1_8;f=h

I will address all these changes once 1.8.5 is released.

Thanks again,
Dominik

···

On Tue, 01 Aug 2006 03:39:29 +0200, Eric Hodel <drbrain@segment7.net> wrote:

But if I create extconf.rb it seems to work:

$ ruby184 extconf.rb
creating Makefile
$ make
gcc -I. -I/usr/local/lib/ruby/1.8/powerpc-darwin8.4.0 -I/usr/local/lib/ruby/1.8/powerpc-darwin8.4.0 -I. -fno-common -g -O2 -pipe -fno-common -c bm_test.c
cc -dynamic -bundle -undefined suppress -flat_namespace -L"/usr/local/lib" -o bm_test.bundle bm_test.o -ldl -lobjc
$ ruby -r bm_test -e0
$ echo $?
0
$

Same for 1.8.5.

rb2cx exhibits the same problems as not using mkmf.

Unfortunately this won't work. In the above example, I just threw
something arbitrary together. I have something where the string that
#tester generates isn't determined until runtime and is dependent on
the object hierarchy.

That's what I thought. I meant that you should generate the code at runtime, then compile that code and load the new C extension at runtime. You could cache the result to skip the actual C compile when the same code has been compiled before; this could probably be done using RubyInline.
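To make the caching idea concrete, here is a minimal sketch in plain Ruby. It assumes nothing about RubyInline or Ruby2CExtension: ExtensionCache, the "scan_" file prefix, and the ".built" suffix are all hypothetical names, and the actual compile step is left to the caller as a block.

```ruby
require "digest/md5"
require "fileutils"
require "tmpdir"

# Hypothetical helper (not part of RubyInline or Ruby2CExtension):
# maps generated source text to a cached build artifact on disk.
class ExtensionCache
  def initialize(dir)
    @dir = dir
    FileUtils.mkdir_p(dir)
  end

  # Returns the artifact path for +source+. The block is called only on a
  # cache miss and is expected to create the file at the yielded path.
  def fetch(source)
    key  = Digest::MD5.hexdigest(source)
    path = File.join(@dir, "scan_#{key}.built")
    yield(source, path) unless File.exist?(path)
    path
  end
end

# Usage: the block stands in for "translate to C, compile, link".
Dir.mktmpdir do |dir|
  cache  = ExtensionCache.new(dir)
  builds = 0
  2.times do
    cache.fetch("a | b") { |src, path| builds += 1; File.write(path, src) }
  end
  puts "compiled #{builds} time(s)"
end
```

Keying on a digest of the generated source means any change to the grammar produces a new artifact, while an unchanged grammar reuses the old one.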

Ruby2CExtension can be used without rb2cx. There is an API, though it is not documented and it might change in the future ;-)

A small example:

$ irb -r ruby2cext
irb(main):001:0> puts Ruby2CExtension::Compiler::compile_ruby_to_c("puts __FILE__", "extension_name", "file_name.rb")
#include <ruby.h>
#include <node.h>
#include <env.h>
#include <st.h>
extern VALUE ruby_top_self;
static VALUE org_ruby_top_self;
static ID sym[1];
static void init_syms() {
   sym[0] = rb_intern("puts");
}
static VALUE global[1];
static void init_globals() {
   global[0] = rb_str_new("file_name.rb", 12);
   rb_global_variable(&(global[0]));
}
static void toplevel_scope_1(VALUE self, NODE *cref) {
   VALUE res;
   /* block */
   /* fcall */
   {
     VALUE recv = self;
     const int argc = 1;
     VALUE argv[1];
     /* str */
     argv[0] = rb_str_new3(global[0]);
     res = rb_funcall2(recv, sym[0] /* puts */, argc, argv);
   }
}
void Init_extension_name() {
   org_ruby_top_self = ruby_top_self;
   rb_global_variable(&org_ruby_top_self);
   init_syms();
   init_globals();
   NODE *cref = rb_node_newnode(NODE_CREF, rb_cObject, 0, 0);
   toplevel_scope_1(ruby_top_self, cref);
}

(The actual code returned by compile_ruby_to_c will not be indented; I have indented it manually for clarity.)

···

On Tue, 01 Aug 2006 01:40:52 +0200, Eric Mahurin <eric.mahurin@gmail.com> wrote:

In my code the methods are named scan/scanner
instead of test/tester. I use this mechanism for making
macros/flattening methods. My parsers (the #scan method) end up being
one giant expression (if there is no recursion). I'm flattening to
get rid of as much of the method calling overhead as I can.

It still should be possible to load a C extension into a meta-class,
right? Wouldn't the Init_* function just define the method in the
meta-class instead of a plain class? Maybe you'd use a temp global
variable to pass this meta-class. I realize caching these so that you
don't have to recompile could be an issue. You could use md5 or
whatever (even string length) to get a mostly unique identifier for
the c/so (like rubyinline).
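
A rough pure-Ruby analogue of the meta-class idea, as a sketch only: $scan_target, fake_init_extension, and install_scan are hypothetical names, and fake_init_extension stands in for requiring a compiled extension whose Init_* function would read the global and define the method in C (for example with rb_define_singleton_method).

```ruby
# Hypothetical temp global, used only to hand the target object to the
# code that runs while the extension is being loaded.
$scan_target = nil

# Stand-in for loading a compiled extension. A real Init_* function would
# read the global and define the method on the object's meta-class in C.
def fake_init_extension
  target = $scan_target
  target.define_singleton_method(:scan) { |input| input.start_with?("a") }
end

# Sets the global, "loads" the extension, and always clears the global.
def install_scan(obj)
  $scan_target = obj
  fake_init_extension
  obj
ensure
  $scan_target = nil
end

parser = install_scan(Object.new)
other  = Object.new

puts parser.scan("abc")          # the method exists only on this object
puts other.respond_to?(:scan)    # other objects are unaffected
```

Because the method lands on the singleton class, each grammar object can carry its own flattened #scan without touching the shared class.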