Psyco

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?

Here is the URL of the article:
http://www-106.ibm.com/developerworks/linux/library/l-psyco.html?t=gr,lnxw07=PsycoC

···

Travis Whitton whitton@atlantic.net

In article slrnaqpbo0.73l.whitton@grub.atlantic.net,

···

Travis Whitton whitton@atlantic.net wrote:

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?

It’s an interesting idea, but first we need a bytecode interpreter. As it
stands now, Ruby is interpreted by walking an AST.
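
To make “walking an AST” concrete, here is a toy sketch of the idea in Ruby
itself. The node classes are invented purely for illustration and bear no
resemblance to the real interpreter’s internals, but the shape of the work
is the same: every execution re-walks the tree and re-dispatches every call.

LiteralNode = Struct.new(:value) do
  def evaluate(env)
    value
  end
end

VarNode = Struct.new(:name) do
  def evaluate(env)
    env[name]
  end
end

CallNode = Struct.new(:receiver, :message, :args) do
  def evaluate(env)
    # dynamic dispatch happens afresh on every visit to this node
    receiver.evaluate(env).send(message, *args.map { |a| a.evaluate(env) })
  end
end

# the expression:  x + 2
tree = CallNode.new(VarNode.new(:x), :+, [LiteralNode.new(2)])
p tree.evaluate(:x => 40)   # => 42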

Phil

It’s already been talked about in this list, in
[ruby-talk:25949] and [ruby-talk:52296].

If Rite ever becomes real, we should definitely explore Psyco’s way.
Of course, it needs an underlying VM, and Ruby’s current implementation
walks the AST…

···

On Wed, Oct 16, 2002 at 09:21:47AM +0900, Travis Whitton wrote:

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

MS-DOS, you can’t live with it, you can live without it.
– from Lars Wirzenius’ .sig

Why would it not be possible to compile the AST to native code? That’s
what a traditional compiler does, after all.

Cheers,
Nat.

···

On Wed, 2002-10-16 at 02:21, Phil Tomson wrote:

In article slrnaqpbo0.73l.whitton@grub.atlantic.net,
Travis Whitton whitton@atlantic.net wrote:

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?

It’s an interesting idea, but first we need a bytecode interpreter. As it
stands now, Ruby is interpreted by walking an AST.


Dr. Nathaniel Pryce, Technical Director, B13media Ltd.
Studio 3a, 22-24 Highbury Grove, London N5 2EA, UK
http://www.b13media.com

“Phil Tomson” ptkwt@shell1.aracnet.com wrote in message
news:aoichg02ir8@enews2.newsguy.com

It’s an interesting idea, but first we need a bytecode interpreter. As it
stands now, Ruby is interpreted by walking an AST.

I’d rather see a proper native code compiler. JITs are not that great:
there is overhead in the compilation process, and there is neither time nor
opportunity for decent optimizations. For tight loops they may be good, but
then we already have inline C.

I think the OCaml model based on the Zinc compiler is great. It generates
bytecode which can be executed, or it generates native assembler from the
compile-time internal bytecode representation. Thus your source is
portable, you have portable bytecode which executes fast, and you have the
option of native execution on selected platforms. All without involving a
bloated JIT compiler at runtime. In fact I think it would be cool to compile
Ruby code into the OCaml runtime.

Mikkel

The real problem is that Ruby is so dynamic. This is great for programmers
but gives compilers a hard time.

Naive example:

(1 + 2)

Most languages wouldn’t have any problem recognizing this as a number
addition and generating quick instructions. Any good compiler would
precompute this as 3 and use the result in the compiled code directly.

In Ruby, the ‘+’ is a method call on a number which could be overridden at
any time, any number of times. It’s impossible for a compiler to predict
just what ‘+’ is going to do from one moment to the next. JIT has a better
chance.

You can generate bytecode but it’s not going to give that much of a speed
increase, IMHO, because it would still have to be doing dynamic method
lookup etc.

···


Justin Johnson

“Nat Pryce” nat.pryce@b13media.com wrote in message
news:1034769293.1864.2.camel@ballsup…

On Wed, 2002-10-16 at 02:21, Phil Tomson wrote:

In article slrnaqpbo0.73l.whitton@grub.atlantic.net,
Travis Whitton whitton@atlantic.net wrote:

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?

It’s an interesting idea, but first we need a bytecode interpreter. As it
stands now, Ruby is interpreted by walking an AST.

Why would it not be possible to compile the AST to native code? That’s
what a traditional compiler does, after all.

Cheers,
Nat.


Dr. Nathaniel Pryce, Technical Director, B13media Ltd.
Studio 3a, 22-24 Highbury Grove, London N5 2EA, UK
http://www.b13media.com

Justin Johnson wrote:

Why would it not be possible to compile the AST to native code? That’s
what a traditional compiler does, after all.

You can generate bytecode but it’s not going to give that much of a speed
increase, IMHO, because it would still have to be doing dynamic method
lookup etc.

It may be faster to JIT-compile bytecode into native code than doing it
directly from the AST. This is useful if you want to distribute bytecode
instead of source.

/Anders

···

A n d e r s B e n g t s s o n | ndrsbngtssn@yahoo.se
Stockholm, Sweden |



Totally agree.

As I am not really into “power programming” but more into
“performance” (considering my Fortran background and my current project),
I am not too interested in adding features to Ruby, but I really wish that
something could be done to increase Ruby’s efficiency. Although several
solutions currently exist to varying degrees (such as Inline, rb2c, and
SWIG), I think it would be nice not to have to leave the Ruby framework (by
having to deal explicitly with C/C++).

Python already has Psyco, which can increase Python execution speed (at
the cost of large memory consumption) without any change to the Python
code itself. Python also already has Pyrex, which lets the programmer
fine-tune the Python code for increased performance.

I think that kind of “extension” would really give a programmer freedom
of choice. For “rapid development”, one can stay within the current Ruby
syntax; on the other hand, for “performance coding”, one can fine-tune the
Ruby code à la Pyrex (instead of reopening that old C book :-) ).

Regards,

Bill

···

============================================================================
Justin Johnson justinj@mobiusent.com wrote:

The real problem is that Ruby is so dynamic. This is great for programmers
but gives compilers a hard time.

Naive example:

(1 + 2)

Most languages wouldn’t have any problem recognizing this as a number
addition and generating quick instructions. Any good compiler would
precompute this as 3 and use the result in the compiled code directly.

Sorry, I wasn’t clear. I meant why would it not be possible to
JIT-compile from the AST (rather than from bytecode)?

The technique of Polymorphic Inline Caches (aka PICs) can be used to
optimise method dispatch in dynamic, JIT-compiled languages by virtue of
the fact that a single message send is usually directed at objects of
just one type, or one of a small number of types, rather than at objects
of any type that can process the message. More info at:
http://www.cs.ucsb.edu/labs/oocsb/papers/pics.html
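
Roughly, each call site remembers which receiver classes it has actually
seen and which method each one resolved to, and only falls back to a full
lookup on a miss. A very loose sketch of the idea in plain Ruby (a real JIT
would compile the class check and the cached target into machine code
rather than consult a Hash, and would cap the cache at a handful of
entries; all the names here are made up for the example):

class CallSite
  def initialize(name)
    @name  = name
    @cache = {}   # receiver class -> method resolved at this call site
  end

  def invoke(receiver, *args)
    meth = @cache[receiver.class]
    if meth.nil?
      # miss: do the full dynamic lookup once, then remember it
      meth = receiver.class.instance_method(@name)
      @cache[receiver.class] = meth
    end
    meth.bind(receiver).call(*args)
  end
end

site = CallSite.new(:to_s)
p site.invoke(42)      # full lookup, then cached for 42's class
p site.invoke(43)      # cache hit
p site.invoke("hi")    # second cache entry, for String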

Cheers,
Nat.

···

On Wed, 2002-10-16 at 17:03, Justin Johnson wrote:

The real problem is that Ruby is so dynamic. This is great for programmers
but gives compilers a hard time.

Naive example:

(1 + 2)

Most languages wouldn’t have any problem recognizing this as a number
addition and generating quick instructions. Any good compiler would
precompute this as 3 and use the result in the compiled code directly.

In Ruby, the ‘+’ is a method call on a number which could be overridden at
any time, any number of times. It’s impossible for a compiler to predict
just what ‘+’ is going to do from one moment to the next. JIT has a better
chance.

You can generate bytecode but it’s not going to give that much of a speed
increase, IMHO, because it would still have to be doing dynamic method
lookup etc.


Justin Johnson

“Nat Pryce” nat.pryce@b13media.com wrote in message
news:1034769293.1864.2.camel@ballsup…

On Wed, 2002-10-16 at 02:21, Phil Tomson wrote:

In article slrnaqpbo0.73l.whitton@grub.atlantic.net,
Travis Whitton whitton@atlantic.net wrote:

There’s an interesting article on IBM developerWorks about a new program
called Psyco, which hooks into the guts of Python’s interpreter and does
a kind of just-in-time compilation, replacing bytecode with native machine
code. Apparently, this allows Python to achieve near-C speeds without
having to rewrite any code, and it sounds like a really cool project.
Unfortunately, I don’t care much for Python, but I do love Ruby. This leads
me to the question of whether there are any similar programs available for
Ruby. If not, would such a project be possible?

It’s an interesting idea, but first we need a bytecode interpreter. As it
stands now, Ruby is interpreted by walking an AST.

Why would it not be possible to compile the AST to native code? That’s
what a traditional compiler does, after all.

Cheers,
Nat.




Dr. Nathaniel Pryce, Technical Director, B13media Ltd.
Studio 3a, 22-24 Highbury Grove, London N5 2EA, UK
http://www.b13media.com

“Justin Johnson” justinj@mobiusent.com wrote in message
news:1034783660.29726.0@demeter.uk.clara.net

In Ruby, the ‘+’ is a method call on a number which could be overridden at
any time, any number of times. It’s impossible for a compiler to predict
just what ‘+’ is going to do from one moment to the next. JIT has a better
chance.

First the example (3 + 4) can always be optimized because the values are
constants. For constants such as (A + B) the problem is that they are not
really constant in Ruby, but that could be solved with a compiler switch
allowing optimization of constants (with an error on constant reassignment).

Second, a “real” compiler can inspect the active scope of an expression
using time-consuming reachability analysis that a Jitter cannot afford to
do. Having a Jitter recompile a function for every new call would be rather
expensive and space-consuming, except for very long-running processes (which
is the only place where I see a sensible use of Jitters), both because there
the compile time is small compared to the runtime, and because you can
gather statistics before starting the compilation.

You can generate bytecode but it’s not going to give that much of a speed
increase, IMHO, because it would still have to be doing dynamic method
lookup etc.

Bytecode can cache lookups, getting benefits similar to JIT, but not quite
JIT. Isn’t this what Ruby is already doing in the AST?
Also, bytecode isn’t necessarily slow when representing “thick” operators
such as Array::sort, which is also why there is a limit to the benefits of
native code. Inlining and type analysis are probably more important
optimization strategies.
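
The “thick operator” point is easy to see: when a single dispatch leads
into a big chunk of C, the dispatch cost stops mattering. A rough sketch of
how one might observe it (absolute numbers will vary wildly with machine
and Ruby version):

require 'benchmark'

arr = Array.new(100_000) { rand(1_000_000) }

Benchmark.bm(14) do |bm|
  # one dynamic dispatch, then the whole sort runs inside C
  bm.report("sort")          { arr.sort }

  # a dynamic dispatch (block call plus <=>) for every comparison
  bm.report("sort w/ block") { arr.sort { |a, b| a <=> b } }
end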

Mikkel

“Anders Bengtsson” ndrsbngtssn@yahoo.se wrote in message
news:1034785403.1141.29.camel@spinoza…

Justin Johnson wrote:

Why would it not be possible to compile the AST to native code? That’s
what a traditional compiler does, after all.

You can generate bytecode but it’s not going to give that much of a speed
increase, IMHO, because it would still have to be doing dynamic method
lookup etc.

It may be faster to JIT-compile bytecode into native code than doing it
directly from the AST. This is useful if you want to distribute bytecode
instead of source.

The AST may actually be faster if the AST is optimized.
Instead of storing the opcode, you can store the address of the code to
execute directly in the node.
Of course, the more the AST is optimized, the more blurred the difference
between bytecode and AST becomes. Bytecode is also sometimes optimized by
replacing the opcode with direct addresses during load. OCaml bytecode is
30% faster when compiled with gcc than with Visual C++, because gcc has a
non-standard goto feature (computed goto) that allows this optimization in
C; apart from that, Visual C++ generates faster code. Of course this
argument does not work as well for dynamic dispatch, but even then you can
store the address of the dispatch code directly.
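
As a toy sketch of the idea in Ruby (the names are invented; in C you would
store a function pointer or a gcc label address in the node rather than a
Proc):

# opcode-style node: the evaluator must decode the opcode on every visit
OpNode = Struct.new(:opcode, :operands)

def eval_op(node)
  case node.opcode
  when :lit then node.operands.first
  when :add then eval_op(node.operands[0]) + eval_op(node.operands[1])
  end
end

# "threaded" node: the node carries the code to run, nothing left to decode
ThreadedNode = Struct.new(:handler, :operands) do
  def evaluate
    handler.call(operands)
  end
end

LIT = lambda { |ops| ops.first }
ADD = lambda { |ops| ops[0].evaluate + ops[1].evaluate }

op_tree  = OpNode.new(:add, [OpNode.new(:lit, [40]), OpNode.new(:lit, [2])])
threaded = ThreadedNode.new(ADD, [ThreadedNode.new(LIT, [40]),
                                  ThreadedNode.new(LIT, [2])])

p eval_op(op_tree)    # => 42, after decoding :add and :lit each time
p threaded.evaluate   # => 42, with no opcode decoding at all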

Mikkel

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com writes:

First the example (3 + 4) can always be optimized because the values are
constants.

Not really…

class Fixnum
  def +(other)
    "boo!"
  end
end

puts 3 + 4

Now admittedly, the runtime could do some work to handle this, but it
could be hard, particularly in the face of ‘eval’ and friends. You
might have to keep the old bytecodes lying around, and reactivate them
if you ever determine any operators such as ‘+’ had been changed.
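
A crude way to picture that in plain Ruby (written against a current Ruby,
where ‘+’ lives on Integer rather than Fixnum; this sketch only flips a
flag from a method_added hook rather than really patching any generated
code):

$plus_redefined = false

class Integer
  def self.method_added(name)
    # somebody touched Integer#+: abandon the optimized path
    $plus_redefined = true if name == :+
    super
  end
end

def seven
  if $plus_redefined
    3 + 4   # the generic path, with honest dynamic dispatch
  else
    7       # the "compiled" path: the constant-folded result
  end
end

p seven     # => 7, via the folded fast path

class Integer
  def +(other)
    "boo!"
  end
end

p seven     # => "boo!", the fast path has been deactivated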

Cheers

Dave

Why allow the user to change the definition of Fixnum#+ to begin with?

Paul

···

On Thu, Oct 17, 2002 at 03:10:38AM +0900, Dave Thomas wrote:

Now admittedly, the runtime could do some work to handle this, but it
could be hard, particularly in the face of ‘eval’ and friends. You
might have to keep the old bytecodes lying around, and reactivate them
if you ever determine any operators such as ‘+’ had been changed.

“Dave Thomas” Dave@PragmaticProgrammer.com wrote in message
news:m2vg42i7a9.fsf@zip.local.thomases.com

“MikkelFJ” mikkelfj-anti-spam@bigfoot.com writes:

You might have to keep the old bytecodes lying around, and reactivate them
if you ever determine any operators such as ‘+’ had been changed.

True.
I actually believe this is the key to high-performance Ruby. There are many
ways you can optimize Ruby. Most or all of these optimizations may become
invalidated, but rarely do. The solution is to always have a backup plan,
but use the optimized version wherever possible.

For example, a guarded compiled code section that first tests the versioning
of the involved classes. Or code compiled for special argument types, like
C++ template specialization.
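
As a rough illustration of the specialization idea (the helper names are
invented, and the guard that a compiler would emit in generated code is
simply written out as ordinary Ruby here):

def generic_sum(list)
  list.inject { |a, b| a + b }    # fully dynamic dispatch per element
end

def float_sum(list)
  # imagine this were a natively compiled, Float-only specialization
  total = 0.0
  list.each { |x| total += x }
  total
end

def sum(list)
  # the guard; checking every element like this defeats the purpose in
  # real life, a compiler would guard far more cheaply
  if list.all? { |x| x.is_a?(Float) }
    float_sum(list)
  else
    generic_sum(list)               # backup plan: the generic version
  end
end

p sum([1.0, 2.5, 3.5])   # guard holds -> specialized path, => 7.0
p sum([1, 2, 3])         # guard fails -> generic path,     => 6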

Less aggressive is using dispatch tables that can be updated, but such that
bytecode or assembler calls into the dispatch table without hash lookups.
Calls to methods not accounted for in the dispatch table go via the
ordinary hash lookup.

Mikkel

Paul Brannan pbrannan@atdesk.com writes:

Now admittedly, the runtime could do some work to handle this, but it
could be hard, particularly in the face of ‘eval’ and friends. You
might have to keep the old bytecodes lying around, and reactivate them
if you ever determine any operators such as ‘+’ had been changed.

Why allow the user to change the definition of Fixnum#+ to begin with?

I don’t know about ‘+’, but I quite often change definitions of
built-in operations. One of my current favorites is

class NilClass
  def empty?
    true
  end
end

p nil.empty?

Very useful for Web systems.

If I can productively change operations in NilClass, then it would
seem consistent to give me equivalent access to Fixnum.

Or, of course, we could do what C# or Java do, where ints are native and
have to be boxed before use. I wouldn’t like to see this.

Cheers

Dave

···

On Thu, Oct 17, 2002 at 03:10:38AM +0900, Dave Thomas wrote:

I don’t know about ‘+’, but I quite often change definitions of
built-in operations. One of my current favorites is

class NilClass
  def empty?
    true
  end
end

p nil.empty?

Very useful for Web systems.

  1. I don’t think that modifying any class that you didn’t write is EVER
    a good idea (it may be useful and it may even be the best solution
    available, but it’s still not a good solution).

  2. You are not changing the definition of a built-in method; you are
    adding a new method to a class that already exists. This is slightly
    different from redefining Fixnum#+.

Or, of course, we could do what C# or Java do, where ints are native and
have to be boxed before use. I wouldn’t like to see this.

That depends on what you are using the int for. In order to call
5.times, you don’t have to box 5, because Fixnums are objects and
respond to the same methods any other object does. However, not all
methods in Object work for Fixnums. For example, I can’t call
Object#extend on an Integer; instead I must write a delegate class,
instantiate a delegate, and call Object#extend on the delegate.
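
Concretely, a small sketch using SimpleDelegator from the standard delegate
library (the module and variable names are invented for the example):

require 'delegate'

module Doubling
  def double
    self * 2    # '*' is forwarded to the wrapped Integer
  end
end

# 5.extend(Doubling) raises a TypeError, because there is no per-object
# singleton class on a Fixnum to put the module into.  Wrapping the number
# in a delegate gives us an ordinary object that we *can* extend.
five = SimpleDelegator.new(5)
five.extend(Doubling)

p five + 2      # => 7, normal numeric methods are still forwarded
p five.double   # => 10, the extended behaviour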

Fixnums look like objects, they smell like objects, and most of the
time they even act like objects, but they are really very very special
objects.

Paul

···

On Thu, Oct 17, 2002 at 04:53:40AM +0900, Dave Thomas wrote:

  1. I don’t think that modifying any class that you didn’t write is EVER
    a good idea (it may be useful and it may even be the best solution
    available, but it’s still not a good solution).

I don’t agree, but I’m not passionate about it either way.

Fixnums look like objects, they smell like objects, and most of the
time they even act like objects, but they are really very very special
objects.

Fixnums will become less special as time
goes by. Good or bad in your opinion?

Hal

···

----- Original Message -----
From: “Paul Brannan” pbrannan@atdesk.com
To: “ruby-talk ML” ruby-talk@ruby-lang.org
Sent: Wednesday, October 16, 2002 3:21 PM
Subject: Re: Psyco

Hi,

As I am more into “performance” than “purity” (such as everything being
an object), in this case I think I favor the Java-like approach. Ruby
is already “ahead” of Java in the sense that even a number like 5 is an
object, so we can write code such as “5.times”.

I think it is good to be able to add methods; I myself added a factorial
method to Integer. However, in my opinion, if we really want to be
pragmatic, we should not allow anyone to modify the “+” method of a
Fixnum, if that can result in increased performance. Can anyone give a
practical example of why we would ever want to modify the “+” method of a
Fixnum? (Besides an excuse to the IRS, just like the old Pentium
calculation flaw? :-) )

I think this is why C++ provides both virtual and non-virtual
functions; the Java approach is to have both “native” data types and
“objects”. I am sorry that I have to refer to these two languages often; at
least they have been commonly used for “programming in the large” projects.

Regards,

Bill

···

=============================================================================
Hal E. Fulton hal9000@hypermetrics.com wrote:

  1. I don’t think that modifying any class that you didn’t write is EVER
    a good idea (it may be useful and it may even be the best solution
    available, but it’s still not a good solution).

I don’t agree, but I’m not passionate about it either way.

Fixnums look like objects, they smell like objects, and most of the
time they even act like objects, but they are really very very special
objects.

Fixnums will become less special as time
goes by. Good or bad in your opinion?

Hal