Compiler for Ruby

n/a wrote:

Hi,

Complete newbie here; is there anything in the works as far as a compiler
for Ruby?

Thanks.

http://www.catb.org/~esr/faqs/smart-questions.html

···

--
Phillip "CynicalRyan" Gawlowski
http://cynicalryan.110mb.com/

Rule of Open-Source Programming #37:

Duplicate effort is inevitable. Live with it.

Yes, there are a variety of approaches to compiling Ruby, either to Java
bytecodes (e.g. XRuby) or to bytecodes more specifically tuned to Ruby
semantics (e.g. YARV, which is now in Ruby 1.9). I think most people
these days assume that bytecode == Java bytecode, but the idea preceded
Java.

As of today YARV seems to be the best performing, at least according
to the benchmarks I've seen.

As for compiling directly to machine code, it could be done I suppose,
but it's not clear that it would be the best approach. Why?

  * The dynamic nature of Ruby means that methods can be created at
run-time and would therefore need to be compiled at run-time.
Additional bookkeeping would be required to ensure that all the
semantic effects on the compiled code are properly implemented.

  * Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set. Digitalk tried direct compilation of Smalltalk to
machine code, because they were sick of getting blasted for being
'interpreted', and found that the byte-coded version ran significantly
faster. The practice these days is to do two-stage compilation: first
to byte-codes, and then to machine code for selected code when the
run-time detects that it is frequently executed. A rough sketch of
that second idea follows below.
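To make that last point a bit more concrete, here is a minimal sketch in C
of the bookkeeping such a two-stage system might do. Nothing here is taken
from YARV or any other real VM; the method record, the threshold, and
jit_compile_to_native() are all made up for illustration, and the "compiler"
is just a stub.

#include <stddef.h>
#include <stdio.h>

/* A hypothetical method record: byte-code plus an optional native version.
   This is not modelled on YARV or any real VM; it only illustrates the
   bookkeeping a two-stage (byte-code first, machine code later) system
   would need. */
struct method {
    const unsigned char *bytecode;        /* first-stage output            */
    size_t               length;
    void               (*native)(void);   /* second-stage output, if any   */
    unsigned long        call_count;      /* how often it has been invoked */
};

enum { HOT_THRESHOLD = 1000 };            /* arbitrary cut-off for "hot"   */

static void interpret_bytecode(const unsigned char *code, size_t length)
{
    (void)code; (void)length;             /* stub: walk the byte-codes here */
}

static void fake_native_version(void)
{
    /* stands in for machine code a real second stage would have emitted */
}

/* Stub for the second stage; a real system would generate machine code. */
static void (*jit_compile_to_native(const unsigned char *c, size_t n))(void)
{
    (void)c; (void)n;
    return fake_native_version;
}

static void call_method(struct method *m)
{
    if (m->native == NULL && ++m->call_count >= HOT_THRESHOLD)
        m->native = jit_compile_to_native(m->bytecode, m->length);

    if (m->native != NULL)
        m->native();                                 /* fast path */
    else
        interpret_bytecode(m->bytecode, m->length);  /* slow path */
}

int main(void)
{
    static const unsigned char code[] = { 0x01, 0x02, 0x00 };  /* dummy byte-codes */
    struct method m = { code, sizeof code, NULL, 0 };
    unsigned long i;

    for (i = 0; i < 1500; i++)       /* enough calls to cross the threshold */
        call_method(&m);

    printf("compiled to native: %s\n", m.native ? "yes" : "no");
    return 0;
}

The only interesting part is the call-count test: stay with the compact
byte-code by default, and only spend the effort of generating machine code
on methods the run-time has actually seen run hot.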

···

On 3/30/07, n/a <na@nomail.invalid> wrote:

I meant under Windows and/or Linux environments.

And source code to machine code.

I had done a search of this group but I guess didn't download enough
headers to see the previous threads.

I see there are various forms of compilers that work at different levels of
code, e.g. XRuby to Java bytecode, etc.

--
Rick DeNatale

My blog on Ruby
http://talklikeaduck.denhaven2.com/

> n/a wrote:
>> Hi,
>>
>> Complete newbie here; is there anything in the works as far as a compiler
>> for Ruby?
>>

Np. I guess your question was very nicely answered by a nice and
competent member; I do not agree with Phil's welcome message.

Welcome to the group
Robert

···

On 3/30/07, n/a <na@nomail.invalid> wrote:

On Sat, 31 Mar 2007 07:46:49 +0900, Phillip Gawlowski wrote:

>> Thanks.
>>
> How To Ask Questions The Smart Way
>

I meant under Windows and/or Linux environments.

And source code to machine code.

I had done a search of this group but I guess didn't download enough
headers to see the previous threads.

I see there are various forms of compilers that work at different levels of
code, e.g. XRuby to Java bytecode, etc.

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Rick DeNatale wrote:

Yes, there are a variety of approaches to compiling Ruby, either to Java
bytecodes (e.g. XRuby) or to bytecodes more specifically tuned to Ruby
semantics (e.g. YARV, which is now in Ruby 1.9). I think most people
these days assume that bytecode == Java bytecode, but the idea preceded
Java.

I first encountered the idea of a "virtual machine" for reasons of portability in the early 1960s. However, the idea probably predates that and goes back to the very early days of computer languages.

As of today YARV seems to be the best performing, at least according
to the benchmarks I've seen.

As for compiling directly to machine code, it could be done I suppose,
but it's not clear that it would be the best approach. Why?

  * The dynamic nature of Ruby means that methods can be created at
run-time and would therefore need to be compiled at run-time.
Additional bookkeeping would be required to ensure that all the
semantic effects on the compiled code are properly implemented.

* Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set. Digitalk tried direct compilation of Smalltalk to
machine code, because they were sick of getting blasted for being
'interpreted', and found that the byte-coded version ran significantly
faster. The practice these days is to do two-stage compilation: first
to byte-codes, and then to machine code for selected code when the
run-time detects that it is frequently executed.

The prototype for a lot of this is (most implementations of) Forth. There is an "inner interpreter", which was originally indirect threaded for portability. However, it can be direct threaded, which is faster, subroutine threaded, which is still faster, or "token" threaded, which is the most compact. This last corresponds most closely to what we think of as "byte code".
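To make the size comparison concrete, here is a toy C sketch (not real Forth, and the word names are invented) of the same four-word program laid out token-threaded, one byte per word, and laid out as one code pointer per word. A real direct-threaded Forth jumps from word to word instead of calling through a loop, so treat the pointer form only as a rough stand-in; the point is the difference in the size of the compiled program.

#include <stdio.h>

/* Toy illustration only: this is not real Forth, the word names are made
   up, and a real direct-threaded Forth jumps rather than calls. */

static long stack[64];
static int  sp = 0;                       /* next free stack slot */

static void push1(void) { stack[sp++] = 1; }
static void push2(void) { stack[sp++] = 2; }
static void add(void)   { sp--; stack[sp - 1] += stack[sp]; }
static void print(void) { printf("%ld\n", stack[--sp]); }

/* Token-threaded ("byte code"): one byte per word in the compiled program. */
enum { T_PUSH1, T_PUSH2, T_ADD, T_PRINT, T_HALT };

static void (*const table[])(void) = { push1, push2, add, print };

static void run_tokens(const unsigned char *ip)
{
    while (*ip != T_HALT)
        table[*ip++]();                   /* fetch token, look up, call */
}

/* Pointer-per-word layout: one full code pointer per word instead. */
static void run_pointers(void (*const *ip)(void))
{
    while (*ip != NULL)
        (*ip++)();                        /* fetch pointer, call */
}

int main(void)
{
    /* "1 2 + ." as tokens: five bytes in total. */
    static const unsigned char tokens[] =
        { T_PUSH1, T_PUSH2, T_ADD, T_PRINT, T_HALT };

    /* The same program as pointers: five pointers instead of five bytes. */
    static void (*const ptrs[])(void) = { push1, push2, add, print, NULL };

    run_tokens(tokens);      /* prints 3 */
    run_pointers(ptrs);      /* prints 3 again */
    return 0;
}

On a 32-bit machine the pointer form is four times the size of the token form, and on a 64-bit machine eight times; that difference, multiplied over a whole image, is where the working-set advantage of byte code comes from.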

Yes, compactness of code is indeed a virtue on "modern machines", although I suspect it's more an issue of caching than virtual memory. By the way, in "reality", I don't think Ruby is any more "dynamic" than languages we normally think of as "static". Almost any decent-sized program or collection of programs is going to have things that are bound early and things that aren't bound till run time, regardless of what languages the implementors used.

···

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

>>
>> > n/a wrote:
>> >> Hi,
>> >>
>> >> Complete newbie here; is there anything in the works as far as a compiler
>> >> for Ruby?
>> >>
> Np I guess your question was very nicely answered by a nice and
> competent member I do not agree with Phil's welcome message.
>
> Welcome to the group
> Robert
>

Robert,
It's not always easy being brand new to programming AND to Ruby, so
a bit of a friendly attitude such as yours goes a long way, IMHO. And it is
very much appreciated.

Oh, do not mention it; if it had not been me, somebody else would have
said the same. It is the group really... glad you feel welcome here.

Cheers
Robert

···

On 3/31/07, n/a <na@nomail.invalid> wrote:

On Sun, 01 Apr 2007 01:12:09 +0900, Robert Dober wrote:
> On 3/30/07, n/a <na@nomail.invalid> wrote:
>> On Sat, 31 Mar 2007 07:46:49 +0900, Phillip Gawlowski wrote:

--
You see things; and you say Why?
But I dream things that never were; and I say Why not?
-- George Bernard Shaw

Rick DeNatale wrote:

* Previous experience with compiling dynamic OO languages has shown
that the much smaller code representation of byte codes compared to
machine code can actually lead to better performance on machines with
virtual memory (almost all machines these days) due to the smaller
working set.

Absolutely correct... but there's even more to this
than meets the eye. The working set that matters most
is the cache, not the RAM. When a RAM access costs
more than 50 instruction cycles, you can do a lot of
work on something that's already in the cache while
you're waiting for the next cache line to fill.

Reducing the working set on boxes with GB of RAM typically
has more effect through decreasing cache spills than via
reductions in page faults. The byte-codes also go in
d-cache while the interpreter itself is in I-cache.

Clifford Heath.

Clifford Heath wrote:

Reducing the working set on boxes with GB of RAM typically
has more effect through decreasing cache spills than via
reductions in page faults. The byte-codes also go in
d-cache while the interpreter itself is in I-cache.

You need a very carefully designed inner interpreter for this to be useful. See http://dec.bournemouth.ac.uk/forth/euro/ef03/ertl-gregg03.pdf and http://dec.bournemouth.ac.uk/forth/euro/ef02/ertl02.pdf for some interesting ways this can be done with the inner interpreter still in C (although they do exploit some features of GCC that not all C compilers support).
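For anyone wondering which "features of GCC" are meant: the one usually relied on in this context is the labels-as-values extension (&&label and goto *ptr), which lets each byte-code handler jump straight to the handler for the next byte-code instead of going back through one central switch. Here is a stripped-down sketch of the idea, with made-up byte-codes and nothing taken from those papers verbatim; it needs GCC or a compiler that copies this extension (e.g. Clang), since it is not standard C.

#include <stdio.h>

enum { OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT };

static void run(const unsigned char *ip)
{
    /* One entry per byte-code, in the same order as the enum above.
       &&label takes the address of a label; this is a GNU C extension. */
    static void *dispatch[] = { &&do_push1, &&do_add, &&do_print, &&do_halt };
    long stack[64];
    int  sp = 0;

    /* Each handler ends by jumping straight to the next one, so there is
       no trip back through a central switch for every instruction. */
#define NEXT() goto *dispatch[*ip++]

    NEXT();

do_push1: stack[sp++] = 1;                     NEXT();
do_add:   sp--; stack[sp - 1] += stack[sp];    NEXT();
do_print: printf("%ld\n", stack[--sp]);        NEXT();
do_halt:  return;

#undef NEXT
}

int main(void)
{
    static const unsigned char program[] =
        { OP_PUSH1, OP_PUSH1, OP_ADD, OP_PRINT, OP_HALT };
    run(program);    /* prints 2 */
    return 0;
}

With a plain switch, every instruction funnels through a single indirect branch; with this layout each handler has its own, which, as I understand the papers, is a large part of the effect they measure.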

···

--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/

If God had meant for carrots to be eaten cooked, He would have given rabbits fire.

M. Edward (Ed) Borasky wrote:

The byte-codes also go in
d-cache while the interpreter itself is in I-cache.

You need a very carefully designed inner interpreter for this to be useful.

Good stuff, Ed, but not really what I meant.
They're modifying direct-threaded code to
aggregate common sequences of functions AIUI,
whereas I wasn't really talking about threaded
code at all, but byte-code. I've used aggressive
inlining to build an interpreter with nearly all
the primitives in one function, leaving normal
C register variables available as registers, and
found that it worked quite well (for emulating a
small microprocessor on a 386, rather than for
byte code). The interesting thing is what a good
compiler can do with such a large function if
it's built this way. You can avoid most call
overhead and have a compact switch table if you
have a well-designed byte-code. Even if the byte
code is highly dense, so that each code needs to
be looked at several times to be executed, that
isn't a problem once it's in cache, as the very
next thing you're often going to do is to fetch
more data or byte-code, and you'll have to wait
for that, so using some of those CPU cycles
decoding the byte-code doesn't hurt much.
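To show roughly what I mean, reduced to a toy (this is illustrative only,
nothing like the emulator I mentioned): keep the whole inner interpreter
in one function, keep the important state in ordinary locals so the
compiler is free to hold them in machine registers, and dispatch with one
compact switch over the byte-code.

#include <stdio.h>

/* A toy byte-code set; any resemblance to a real VM is coincidental. */
enum { OP_PUSH, OP_ADD, OP_MUL, OP_PRINT, OP_HALT };

/* Everything lives in one function: the instruction pointer, the stack
   pointer and a small stack are plain locals, which a decent compiler can
   keep in registers, and dispatch is a single compact switch. */
static void run(const unsigned char *ip)
{
    long stack[64];
    int  sp = 0;

    for (;;) {
        switch (*ip++) {
        case OP_PUSH:  stack[sp++] = *ip++;               break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp];  break;
        case OP_MUL:   sp--; stack[sp - 1] *= stack[sp];  break;
        case OP_PRINT: printf("%ld\n", stack[--sp]);      break;
        case OP_HALT:  return;
        }
    }
}

int main(void)
{
    /* (2 + 3) * 4, then print the result: ten bytes of byte-code in all. */
    static const unsigned char program[] = {
        OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_PRINT, OP_HALT
    };
    run(program);    /* prints 20 */
    return 0;
}

Everything the loop touches (the byte-code, the small stack, the switch's
jump table) is tiny, so it all sits in cache together, which is the
situation I was describing above.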

Clifford Heath.