ANN: ParseTree 1.3.3 and ruby2c 1.0.0 beta 1

Actual announcements are on http://blog.zenspider.com/

Copy/paste job below:

···

=====

I am releasing ParseTree 1.3.3 today in preparation for our ruby2c release (also today). Changes in ParseTree are minor, but necessary for ruby2c.

  ParseTree is a C extension (using RubyInline) that extracts the parse tree for an entire class or a specific method and returns it as an s-expression (aka sexp) using ruby's arrays, strings, symbols, and integers.

  As an example:
   def conditional1(arg1)
     if arg1 == 0 then
       return 1
     end
     return 0
   end

  becomes:
   [:defn,
     :conditional1,
     [:scope,
      [:block,
       [:args, :arg1],
       [:if,
        [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
        [:return, [:lit, 1]],
        nil],
       [:return, [:lit, 0]]]]]

Features/Problems:
  + Uses RubyInline, so it just drops in.
  + Includes SexpProcessor and CompositeSexpProcessor.
    + Allows you to write very clean filters.
  + Includes show.rb, which lets you quickly snoop code.
  + Includes abc.rb, which lets you get abc metrics on code.
    + abc metrics = numbers of assignments, branches, and calls.
    + whitespace independent metric for method complexity.
  + Only works on methods in classes/modules, not arbitrary code.
  + Does not work on the core classes, as they are not ruby (yet).
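
To make the abc metric above concrete, here is a minimal sketch that counts assignments, branches, and calls by walking the sexp from the conditional1 example as plain Ruby arrays. It does not use ParseTree or abc.rb. The node groupings (ASSIGNMENTS, BRANCHES, CALLS) and the abc_counts helper are assumptions made for illustration, and the real abc.rb may classify nodes differently.

```ruby
# Sexp copied from the conditional1 example in the announcement.
CONDITIONAL1_SEXP =
  [:defn,
   :conditional1,
   [:scope,
    [:block,
     [:args, :arg1],
     [:if,
      [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
      [:return, [:lit, 1]],
      nil],
     [:return, [:lit, 0]]]]]

# Assumed node groupings (hypothetical); abc.rb may use a different set.
ASSIGNMENTS = [:lasgn, :iasgn, :gasgn, :dasgn, :cvasgn, :attrasgn]
BRANCHES    = [:if, :while, :until, :case, :when]
CALLS       = [:call, :fcall, :vcall]

# Recursively walk the nested arrays, tallying nodes by their type,
# which is the first element of each array.
def abc_counts(sexp, counts = Hash.new(0))
  return counts unless sexp.is_a?(Array)
  type = sexp.first
  counts[:assignments] += 1 if ASSIGNMENTS.include?(type)
  counts[:branches]    += 1 if BRANCHES.include?(type)
  counts[:calls]       += 1 if CALLS.include?(type)
  sexp.each { |child| abc_counts(child, counts) }
  counts
end

counts = abc_counts(CONDITIONAL1_SEXP)
puts "assignments=#{counts[:assignments]} branches=#{counts[:branches]} calls=#{counts[:calls]}"
```

Because the walk only looks at node types, reformatting the source does not change the result, which is the sense in which the metric is whitespace independent.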

Changes:
  + 3 minor enhancements
    + Cleaned up parse_tree_abc output
    + Patched up null class names (delegate classes are weird!)
    + Added UnknownNodeError and switched SyntaxError over to it.
  + 2 bug fixes
    + Fixed BEGIN node handling to recurse instead of going flat.
    + FINALLY fixed the weird compiler errors seen on some versions of gcc 3.4.x related to type-punned pointers.

=====

Releasing ruby2c 1.0.0 beta 1

After far too long, I finally have the dubious honor of releasing ruby2c 1.0.0 beta 1 today. I'm itching to do it; we really need to get it out there so people can get their eyes on it and give us feedback. I'm also nervous as hell... the thing is a mess!

  Understand what we mean by beta. It means we need eyes on it and that it was ready enough to put out in the wild, but it also means that it isn't ready for any real use.

  What can it do?

Well, currently it can pass all of its unit tests (325 tests with 512 assertions) and it can translate nice simple static algorithmic code into C without much problem. For example:
% cat x.rb
class Something
   def blah; return 2+2; end
   def main; return blah; end
end
% ./translate.rb x.rb > x.c
% gcc -I /usr/local/lib/ruby/1.8/powerpc-darwin x.c
x.c: In function `main':
x.c:17: warning: return type of `main' is not `int'
% ./a.out
% echo $?
4

  What can it not do?

More than it can.

It can't (and won't) translate dynamic code. Period. That is simply not the intent.

It probably can't translate a lot of static code that we simply haven't come across or anticipated yet. Our tests cover a fair amount, and our validation runs cover a lot more than that, but it is still fairly idiomatic ruby, which makes us better at certain styles of coding and much worse at others.

It is also simply rough around the edges. We've rounded out the rdoc but haven't done a thing for general documentation yet. These are on our list, and rather high on our priority list, but we just haven't had the time yet. For now, check out the rdoc and the PDF presentation that we've had up for a while.

PLEASE: file bugs! We need feedback and we'd like to be able to track it. The ruby2c project is on rubyforge and I'm getting the trackers set up today as well.

This is what I've been waiting for a LOONG time :)
Can't wait to start playing with this!

thank you very much!
George.

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

[This is just my personal opinion, and is not meant to be mean]

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby. Furthermore, it is
a very restrictive subset you chose.

In your example, you end up with a method like

void hello1(long param) { .. }

Classes compiled in such a way:
* Need a very restrictive wrapper to be called from ruby
* Methods have to be final
* No dynamic binding
* Explicit type conversion before calling the method
* Cannot be extended from ruby
* Do not use the ruby C framework

Smalltalk was mentioned in the BLOG as an example of a language, where
most functionality is written in the language itself, but:

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which YARV may realize, is much more
appropriate for a dynamic language like ruby.

But then again, maybe I did not understand your intent.

best regards,

···

On Wed, 02 Feb 2005 10:56:09 +0900, Ryan Davis wrote:

congratulations on the release!!!!
Alex

···

This is very cool. Would it be possible to use ParseTree for a
refactoring toolkit?

martinus

Ryan Davis wrote:

Releasing ruby2c 1.0.0 beta 1

Care to offer any comparisons with Python Pyrex?
http://nz.cosc.canterbury.ac.nz/~greg/python/Pyrex/

···

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby. Furthermore, it is
a very restrictive subset you chose.

not my place to say really as i'm not involved
directly in the project, but... the idea of ruby2c
is to make it possible to write an interpreter in
a fairly idiomatic ruby subset. the aim is not to
be used for directly executing end user code, but
instead for making a maintainable interpreter written
in this subset ruby, and for making it *much* easier
for ruby coders to write fast extension modules
without forcing them to code c :)

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which YARV may realize, is much more
appropriate for a dynamic language like ruby.

maybe the paragraph is just confusing me :), but just in case,
the smalltalk vm doesn't directly execute smalltalk but instead
a fairly low level (though certainly not processor level) bytecode,
the smalltalk execution still requires compilation. much as with yarv.

yarv is written in c. that's already enough for me to
dislike it, unfortunately, though no offense to koichi; he's
doing an *excellent* job.

Alex

···

On Feb 2, 2005, at 12:00 PM, Benedikt Huber wrote:

AFAIK, the rrb project, which adds some refactoring capabilities to emacs, uses a library similar to ParseTree called ripper. BTW, ripper is now part of Ruby 1.9.

Link: http://www.kmc.gr.jp/proj/rrb/index-en.html

Cheers,
Kent.

···

On Feb 2, 2005, at 8:45 AM, martinus wrote:

This is very cool. Would it be possible to use ParseTree for a
refactoring toolkit?

martinus

Glenn Parker ha scritto:

Ryan Davis wrote:

Releasing ruby2c 1.0.0 beta 1

Care to offer any comparisons with Python Pyrex?
http://nz.cosc.canterbury.ac.nz/~greg/python/Pyrex/

I can at least think of one:
pyrex can use type hints, and actually uses a slightly different syntax from the python one, while ruby2c translates actual ruby code to C

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

[This is just my personal opinion, and is not meant to be mean]

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby.

We're only human. Getting Ruby2C as far as it is has been a very large investment of our time, and I think it's cool that we have any type inferencing at all. We'd like to be able to translate more dynamic things, but it will be easier if we have other eyeballs on this helping us out.

You should, however, check out the propaganda document if you haven't already, it gives a much better idea of our goals:

http://www.zenspider.com/~ryand/Ruby2C.pdf

Furthermore, it is a very restrictive subset you chose.

That's because we haven't yet had the time to make it any larger than it is. We are releasing because we want to recruit people to help us expand that subset. (And to do other things, check out the propaganda document above.)

Instead, we focused on having a very helpful tool-chain and an extensive suite of tests. These have helped us do very powerful things to the Ruby AST in a very short amount of time.

In your example, you end up with a method like

void hello1(long param) { .. }

Classes compiled in such a way:
* Need a very restrictive wrapper to be called from ruby
* Methods have to be final
* No dynamic binding

Check out the propaganda document... There's an interesting slide near the very end.

* Explicit type conversion before calling the method

You have to do this when writing wrappers to C code anyhow, but again, read that propaganda document.

* Cannot be extended from ruby

The propaganda document gives a good workaround for this, and is a good example of the 90/10 rule.

* Do not use the ruby C framework

But they can! See the propaganda document.

Remember that extension writing is *not* our goal, it is more of a side benefit that Ruby2C gives you.

Smalltalk was mentioned in the BLOG as an example of a language, where
most functionality is written in the language itself, but:

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which YARV may realize, is much more
appropriate for a dynamic language like ruby.

But then again, maybe I did not understand your intent.

You catch it exactly, but you miss how our tool fits into what Squeak Smalltalk does.

You could write all of the code in Ruby's core, even the VM, in Ruby. Then you translate the absolute minimum to C you need automatically with Ruby2C (eventually, just the VM).

In order to get there, however, you need to have Ruby2C working in general.

Ruby2C is better suited to the C side of things, Array, Hash, String, a VM, than it is to the Ruby side of things, because the C side of Ruby is much less dynamic. These things can all be written in the Ruby2C subset then translated automatically. As Ruby VMs get faster, eventually Ruby2C will no longer be necessary, and hopefully will only be used on the Ruby VM itself.


···

On 02 Feb 2005, at 03:00, Benedikt Huber wrote:

On Wed, 02 Feb 2005 10:56:09 +0900, Ryan Davis wrote:

--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

You are very welcome. We look forward to your feedback.

···

On Feb 2, 2005, at 3:00 AM, George Moschovitis wrote:

This is what I've been waiting for a LOONG time :)
Can't wait to start playing with this!

Thank you!

(your turn)

:)

···

On Feb 2, 2005, at 3:41 AM, Alexander Kellett wrote:

congratulations on the release!!!!

Hrm. Not having written a refactoring toolkit, I can only speculate. I'd say it has an OK chance at it, but ParseTree itself is a very, very thin layer compared to what I'm guessing is a lot of infrastructure needed for a full refactoring tool.

The problem is that the parse tree doesn't preserve comments at all. This alone could make a good refactoring browser difficult. My guess is that you'd use ParseTree just for analysis and something like emacs or freeride to do the actual work.

As an aside, I wrote a (very) quick proof of concept for RubyInline that uses ParseTree. It is called Ruby2Ruby. It allows you to "translate" ruby into ruby by providing a to_ruby method on the Method class. By doing something like obj.method(:meth).to_ruby, you get a string back that is (as close as we can get it) a reconstruction of the method's source code.

to_ruby is actually pretty small. It grabs the parse tree for the method in question, and runs it through a rather small (because it is incomplete) and simple (because SexpProcessor's architecture is really cool) class called RubyToRubyProcessor. Not much work. It took me roughly 30 minutes to get the PoC done.
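
For readers who want a feel for the idea, the sketch below turns the conditional1 sexp from the announcement back into source text. It is a standalone illustration only: MiniRubyToRuby is a hypothetical class, not the real RubyToRubyProcessor, and it does not use SexpProcessor or ParseTree. It handles just the node types that appear in that one example and dispatches on the node type with Object#send.

```ruby
# Sexp copied from the conditional1 example in the announcement.
SEXP =
  [:defn,
   :conditional1,
   [:scope,
    [:block,
     [:args, :arg1],
     [:if,
      [:call, [:lvar, :arg1], :==, [:array, [:lit, 0]]],
      [:return, [:lit, 1]],
      nil],
     [:return, [:lit, 0]]]]]

# Hypothetical stand-in for illustration; not the real RubyToRubyProcessor.
class MiniRubyToRuby
  # Dispatch to process_<type>, passing everything after the type symbol.
  def process(sexp)
    return "" if sexp.nil?
    send("process_#{sexp.first}", sexp[1..-1])
  end

  def process_defn(rest)
    name, scope = rest
    args, *body = scope[1][1..-1]      # scope -> block -> [args, stmt, ...]
    params = args[1..-1].join(", ")
    lines = body.map { |stmt| indent(process(stmt)) }
    "def #{name}(#{params})\n#{lines.join("\n")}\nend"
  end

  def process_if(rest)
    cond, then_part, else_part = rest
    out = "if #{process(cond)} then\n#{indent(process(then_part))}\n"
    out << "else\n#{indent(process(else_part))}\n" if else_part
    out << "end"
  end

  def process_call(rest)
    receiver, name, args = rest
    arg_src = args ? process(args) : ""
    if name.to_s =~ /\A\W+\z/          # operator-style method such as ==
      "#{process(receiver)} #{name} #{arg_src}"
    else
      "#{process(receiver)}.#{name}(#{arg_src})"
    end
  end

  def process_array(rest)
    rest.map { |item| process(item) }.join(", ")
  end

  def process_lvar(rest);   rest.first.to_s;                 end
  def process_lit(rest);    rest.first.inspect;              end
  def process_return(rest); "return #{process(rest.first)}"; end

  private

  # Indent every line of a (possibly multi-line) chunk by two spaces.
  def indent(text)
    text.gsub(/^/, "  ")
  end
end

puts MiniRubyToRuby.new.process(SEXP)
```

One small method per node type keeps each piece trivial, which is presumably part of why an incomplete processor can be put together quickly.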

···

On Feb 2, 2005, at 5:45 AM, martinus wrote:

This is very cool. Would it be possible to use ParseTree for a
refactoring toolkit?

I've only taken a brief look, but from what I can gather Pyrex is a lot more like RubyInline than it is like ruby2c. In RI you can do things like this:

require 'inline'
class MyTest
   def factorial(n)
     f = 1
     n.downto(2) { |x| f *= x }
     f
   end

   inline do |builder|
     builder.include "<math.h>"
     builder.c "
     long factorial_c(int max) {
       int i=max, result=1;
       while (i >= 2) { result *= i--; }
       return result;
     }"
   end
end

and call both MyTest.new.factorial(5) and MyTest.new.factorial_c(5). Just by loading the "file" above, all of your argument and return type conversion is done for you, and the code is exported, compiled, linked, and loaded back in (and, I might add, only when the code has actually changed). No install.rb or setup.rb. No waiting. No extra phases at all.

···

On Feb 2, 2005, at 6:25 AM, Glenn Parker wrote:

Ryan Davis wrote:

Releasing ruby2c 1.0.0 beta 1

Care to offer any comparisons with Python Pyrex?
http://nz.cosc.canterbury.ac.nz/~greg/python/Pyrex/

--
ryand-ruby@zenspider.com - http://blog.zenspider.com/
http://rubyforge.org/projects/ruby2c/
http://rubyforge.org/projects/rubyinline/

It can't (and won't) translate dynamic code. Period. That is simply not
the intent.

[This is just my personal opinion, and is not meant to be mean]

I guess the name Ruby2C and its goals are not well chosen, for in my
opinion it makes no sense to rely on type inference and conversion to
C-types in an inherently dynamic language like ruby. Furthermore, it is
a very restrictive subset you chose.

While I respect your opinion, your statement is loaded. You admit (in a later email) to having read our propaganda, so you know why we chose type inference and conversion to C types and you know that the subset we chose is restrictive on purpose (although we are working on opening that up as much as we can, there is still only so much that we can do with a static subset of ruby). You also know _why_ we are doing this, so I'm terribly confused why you think that both the name "ruby2c" (which is what it does, for a subset of ruby) and the goals are not well chosen. I don't think your bullet points below do a very good job of explaining that and I'd like to understand your opinion more.

In your example, you end up with a method like

void hello1(long param) { .. }

Classes compiled in such a way:

* Need a very restrictive wrapper to be called from ruby

I'm not sure I understand this point, but I'm guessing it is just a matter of nomenclature.

* Methods have to be final
* No dynamic binding
* Explicit type conversion before calling the method

Yup yup. This is exactly our intent and I think we are right for choosing it.

* Cannot be extended from ruby

Do you mean the code that we translate to C cannot be extended? If so, then yes, that is our intent, in the exact same sense that ruby's C code cannot be extended (in C) very easily (but then can be extended in ruby).

* Do not use the ruby C framework

Well, at this point we have only scraped up a rather poor translation to C, and whether we do completely generic C, or ruby extension C, or _both_ is up in the air. Since we architected the tools using a pipeline with a very clean architecture, it is very feasible (and EASY) to do both and have two different ends of the pipeline, depending on what the user is trying to translate to. I'll be working on some doco/propaganda to illustrate that in the near future because we haven't made that clear at all.

Smalltalk was mentioned in the BLOG as an example of a language, where
most functionality is written in the language itself, but:

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which YARV may realize, is much more
appropriate for a dynamic language like ruby.

Yes, Smalltalk relies on a VM. And in at least the case of squeak (but I also think a couple of others--but minor ones at that) the VM is written in a static smalltalk subset and then translated down to C. The product of which made the squeak team MUCH more efficient and able to port their code to more platforms faster than they'd been able to prior to having the system written entirely in smalltalk.

But then again, maybe I did not understand your intent.

I'm pretty sure you do, based on email from you that I haven't responded to yet. :)

···

On Feb 2, 2005, at 3:00 AM, Benedikt Huber wrote:

On Wed, 02 Feb 2005 10:56:09 +0900, Ryan Davis wrote:

--
ryand-ruby@zenspider.com - http://blog.zenspider.com/
http://rubyforge.org/projects/ruby2c/
Seattle.rb | Home

I guess the name Ruby2C and its goals are not well chosen...

for making it *much* easier for ruby coders to write fast extension
modules without forcing them to code c :)

I understood this point. So RubyC would be a better name (i.e. a high
level description language for C with automatic type inference). If
this is the _main_ goal, i can see some benefits. Also, you would
have to supply some low-level IO mechanism if you want to write e.g.
hardware related extensions.

In Smalltalk, although most things are written in Smalltalk itself, they
rely on a VM, which is able to interpret all kinds of smalltalk code. I
think this approach, which YARV may realize, is much more
appropriate for a dynamic language like ruby.

maybe the paragraph is just confusing me :), but just in case, the
smalltalk vm doesn't directly execute smalltalk but instead a fairly low
level (though certainly not processor level) bytecode, the smalltalk
execution still requires compilation. much as with yarv.

I apologize. I was talking about Smalltalk bytecode - but bytecode can do
the same things as sourcecode (if you have a compiler, of course).

···

On Wed, 02 Feb 2005 20:40:39 +0900, Alexander Kellett wrote:

On Feb 2, 2005, at 12:00 PM, Benedikt Huber wrote:

You should, however, check out the propaganda document if you haven't
already, it gives a much better idea of our goals:

I had read it, but I missed that page at the end. Sorry for that.
But inlining a method, and converting a whole program to plain C (w.o.
the overhead from dynamic method dispatch etc.) are two different
things. In my defense: the latter is what you promote in the first 25
slides.

You could write all of the code in Ruby's core, even the VM, in Ruby.
Then you translate the absolute minimum to C you need automatically with
Ruby2C (eventually, just the VM).

Ok, this sounds very ambitious. The ruby core is well written and it's
hard to write an equivalent substitution.
And the VM part: I think it is very hard to write a fast VM in C. A
Ruby2C translator which generates a fast VM sounds like a miracle.

Anyway, good luck - I'm sure it is a lot of fun.
At least you do not have to write in C[1] ;)

[1] http://gnu.de.uu.net/wic.html

···

On Thu, 03 Feb 2005 03:52:11 +0900, Eric Hodel wrote:

Whoa. You give ParseTree a lot more credit than it deserves. Ripper is big, and it is doing real work to do what it does. ParseTree is a little brown stinky ferret that digs down a hole and violently rips the AST away from the warm bosom of ruby. In other words, we cheat, they don't.

···

On Feb 2, 2005, at 6:40 AM, Kent Sibilev wrote:

AFAIK, the rrb project, which adds some refactoring capabilities to emacs, uses a library similar to ParseTree called ripper. BTW, ripper is now part of Ruby 1.9.

--
ryand-ruby@zenspider.com - http://blog.zenspider.com/
http://rubyforge.org/projects/ruby2c/
http://rubyforge.org/projects/parsetree/

I'm torn. On one hand I'd like to solely focus on metaruby (keep the eye on the ball). On the other, I think ruby2c has good potential to be generally usable by a much wider audience to optimize bottlenecked code. It seems to me a good way to recruit for a majority of the toolset so we can then better balance our time between the two goals.

···

On Feb 2, 2005, at 3:40 AM, Alexander Kellett wrote:

not my place to say really as i'm not involved directly in the project, but... the idea of ruby2c is to make it possible to write an interpreter in a fairly idiomatic ruby subset. the aim is not to be used for directly executing end user code, but instead for making a maintainable interpreter written in this subset ruby, and for making it *much* easier for ruby coders to write fast extension modules without forcing them to code c :)

--
ryand-ruby@zenspider.com - http://blog.zenspider.com/
http://rubyforge.org/projects/ruby2c/
http://rubyforge.org/projects/parsetree/

How about PreRuby? [1] :)

Winking to welcome everyone-ly yours,
Michael

[1] "The PreScheme compiler makes use of type inference, partial
evaluation and Scheme and Lisp compiler technology to compile the
problematic features of Scheme, such as closures, into C code without
significant run-time overhead."

···

On Wed, 2 Feb 2005 22:15:47 +0900, Benedikt Huber <benjovi@gmx.net> wrote:

On Wed, 02 Feb 2005 20:40:39 +0900, Alexander Kellett wrote:

> On Feb 2, 2005, at 12:00 PM, Benedikt Huber wrote:
>> I guess the name Ruby2C and its goals are not well chosen...
> for making it *much* easier for ruby coders to write fast extension
> modules without forcing them to code c :)
I understood this point. So RubyC would be a better name (i.e. a high
level description language for C with automatic type inference). If
this is the _main_ goal, i can see some benefits. Also, you would
have to supply some low-level IO mechanism if you want to write e.g.
hardware related extensions.