Seven new VMs, all in a row

I've been giving this some thought. There's no reason why we can't create ST instance vars at runtime. However, it will mean deferring compilation. The suggestion to mask its creation is a good one from a performance standpoint, and is certainly doable.

--Peter

···

On Apr 8, 2005, at 7:54 AM, Glenn Parker wrote:

Avi Bryant wrote:

- instance variables that don't need to be predeclared with the class

Not exactly as Ruby does, but you can simply add new instance variables
to the class definition whenever you compile a method that uses a new
one.

Isn't that jumping the gun just a bit? An instance variable (in Ruby) should not exist in an object until the line that assigns/creates it is actually executed. It's a subtle point, but it could impact some types of reflective programming. Maybe you can mask its existence somehow?
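A minimal sketch of the Ruby behaviour described above, assuming Ruby 1.9 or later (where `instance_variables` returns symbols). The `Point` class and its methods are hypothetical names used only for illustration.

```ruby
class Point
  def initialize
    @x = 0            # @x exists as soon as this line has run
  end

  def set_y(value)
    @y = value        # @y does not exist until set_y is actually called
  end
end

pt = Point.new
before = pt.instance_variables                      # => [:@x]
y_known_before = pt.instance_variable_defined?(:@y) # => false

pt.set_y(5)
after = pt.instance_variables                       # => [:@x, :@y]
```

Reflective code that inspects `instance_variables` would see a difference if `@y` were reported before `set_y` had run, which is the subtle point being raised.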

--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.

Sorry, it was late at night, and that came off harsher than it should have. It's just that I'm finding myself essentially repeating myself. The feedback has informed my thinking and has been valuable, but I should just tell people to go and find and read docs. I will get around to posting them on the RubyForge site. This project is named Alumina. Ruby is just one crystalline form of Alumina.

--Peter

···

On Apr 8, 2005, at 8:19 AM, George wrote:

Avi / Peter -- thanks for providing the feedback I was looking for.
Ruby and Smalltalk do indeed sound like a good match. I remember Robert
Feldt saying a few years ago that he'd done some experiments with Ruby
on Smallscript's VM that looked promising.

Peter said:

From now on, if you are completely ignorant of Smalltalk,
I will cease answering these questions here.

Yikes! I _am_ almost completely ignorant of Smalltalk (I know the
basic concepts of the language, but I've never programmed in it), but
I've done some work on VMs and was genuinely interested in the answers.


"Lothar Scholz" <mailinglists@scriptolutions.com> wrote in message

I would like to add

- singleton methods

to this list, as I think this does not exist in Smalltalk, and this is
a feature that could knock out the whole method compilation algorithm.

I doubt it will. Even in Ruby, singleton methods are implemented with
classes under the covers.

You get the "no-prize"! Yes, this one is easy.
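A small illustration of the point that Ruby's singleton methods live in a class under the covers, assuming Ruby 1.9.2 or later (for `singleton_class`). The `duck` object and `quack` method are hypothetical.

```ruby
duck = Object.new

# Define a method on this one object only.
def duck.quack
  "Quack!"
end

# The method is owned by a hidden per-object class, not by Object.
owner = duck.method(:quack).owner
same_class = owner.equal?(duck.singleton_class)        # => true
held_methods = duck.singleton_class.instance_methods(false)  # => [:quack]

# Other instances of Object are unaffected.
others_respond = Object.new.respond_to?(:quack)        # => false
```

Because the method ends up in an ordinary (if anonymous) class, a class-based method compiler can treat it like any other method.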

The fact that Ruby instance variables appear on demand, and so references to
instance-vars cannot be directly mapped to integer offsets from a known
class definition, will likely mean some performance trade-off.

Already covered this.

--Peter

···

On Apr 8, 2005, at 2:14 PM, itsme213 wrote:


Hello Lothar,

Hello Avi,

Sorry, my last post was meant to be in reply to Lothar's request for a
VM spec. I still haven't got the hang of this new google groups UI...

Yes, nice but this is Squeak. It seems that this is just a bytecode
machine without a JIT. Is there any document available for VisualWorks?

just found this paragraph and then stopped reading the document:

···

---------------------------
In a typical system it often turns out that the same message is sent to instances of the
same class again and again; consider how often we use arrays of SmallInteger or
Character or String. To improve average performance, the VM can cache the
found method. If the same combination of the method and the receiver's class are
found in the cache, we avoid a repeat of the full search of the MethodDictionary
chain. See the method Interpreter > lookupInMethodCacheSel:class:
for the implementation.
VisualWorks and some other commercial Smalltalks use inline cacheing, whereby the
cached target and some checking information is included inline with the dynamically
translated methods. Although more efficient, it is more complex and has strong
interactions with the details of the cpu instruction and data caches.
---------------------------
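The method cache described in the quoted paragraph can be sketched roughly as follows. This is a toy model in Ruby, not Squeak's actual implementation; `ToyVM`, `lookup`, `send_message`, and `full_lookups` are hypothetical names, and Ruby's own `ancestors` chain stands in for the MethodDictionary chain.

```ruby
class ToyVM
  attr_reader :full_lookups

  def initialize
    @cache = {}         # [receiver class, selector] -> method
    @full_lookups = 0   # counts how often the slow path runs
  end

  # Consult the cache first; fall back to the full search on a miss.
  def lookup(klass, selector)
    key = [klass, selector]
    @cache.fetch(key) do
      @full_lookups += 1
      @cache[key] = full_lookup(klass, selector)
    end
  end

  def send_message(receiver, selector, *args)
    lookup(receiver.class, selector).bind(receiver).call(*args)
  end

  private

  # Slow path: walk the class chain until some class defines the selector.
  def full_lookup(klass, selector)
    klass.ancestors.each do |mod|
      return mod.instance_method(selector) if mod.instance_methods(false).include?(selector)
    end
    raise NoMethodError, "undefined method #{selector} for #{klass}"
  end
end

vm = ToyVM.new
first  = vm.send_message(3, :+, 4)    # miss: full lookup, then cached
second = vm.send_message(10, :+, 1)   # hit: same class and selector
third  = vm.send_message("abc", :length)  # miss: different class
```

Sends to the same class and selector pay for the chain walk only once, which is the "average performance" gain the text refers to.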

So message dispatching with Squeak is not much more efficient than what
I expect from YARV: a simple bytecode dispatcher with dynamic method
lookup tables and a small lookup cache. It now seems to be a simple question
of the GC, and there I would again vote for the Boehm-Demers-Weiser GC, which
is a quite fast incremental GC working on all popular platforms and much easier
to integrate (even with typing hints) into "gc.c" than your project.

Did you ever think about the legal problems of your project when
using the VisualWorks engine? I took a quick look at the Cincom web page and
I guess it is as expensive as ever, which means > 5000 US$ per
license. I hate it when companies do not even publish their prices
without a contact form.

--
Best regards, emailto: scholz at scriptolutions dot com
Lothar Scholz http://www.ruby-ide.com
CTO Scriptolutions Ruby, PHP, Python IDE 's

What about systems like Rails where members are added to the class
definition at runtime from a database definition? ( I assume that's how
it works. )

Avi Bryant wrote:

Glenn Parker wrote:

Avi Bryant wrote:

Not exactly as Ruby does, but you can simply add new instance variables
to the class definition whenever you compile a method that uses a new
one.

Isn't that jumping the gun just a bit? An instance variable (in Ruby)
should not exist in an object until the line that assigns/creates it is
actually executed. It's a subtle point, but it could impact some types
of reflective programming. Maybe you can mask its existence somehow?

Yes, pretty easily I'd think. For example: when you create a new
instance in Smalltalk, all the instance variables start out initialized
to Smalltalk's nil value (of class UndefinedObject). As soon as it was
referenced or assigned to, that would get replaced with some Ruby value
(possibly Ruby's nil, of class NilClass). So you could always tell for
a given instance which instance variables already "exist" inside the
Ruby semantics.
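The masking idea above can be sketched in plain Ruby. This is only a model under stated assumptions: `UNSET` stands in for Smalltalk's nil, `SlotObject` and its methods are hypothetical names, and (following Ruby's behaviour rather than the "referenced or assigned" wording) only assignment makes a variable exist here.

```ruby
UNSET = Object.new.freeze   # plays the role of Smalltalk's nil (UndefinedObject)

class SlotObject
  def initialize(slot_names)
    @names = slot_names                         # layout fixed when methods are compiled
    @slots = Array.new(slot_names.size, UNSET)  # every slot starts out "not yet existing"
  end

  def ivar_set(name, value)
    @slots[index_of(name)] = value
  end

  # Reading an unset slot yields Ruby's nil without creating the variable.
  def ivar_get(name)
    value = @slots[index_of(name)]
    value.equal?(UNSET) ? nil : value
  end

  # Only slots that no longer hold the sentinel "exist" in the Ruby sense.
  def ivars
    @names.reject { |name| @slots[index_of(name)].equal?(UNSET) }
  end

  private

  def index_of(name)
    @names.index(name) || raise(NameError, "no slot for #{name}")
  end
end

obj = SlotObject.new([:@x, :@y])
obj.ivar_set(:@x, nil)      # assigning Ruby's nil still creates @x
visible = obj.ivars         # => [:@x]
y_value = obj.ivar_get(:@y) # => nil, and @y stays hidden
```

The slot for `@y` is allocated from the start, yet reflection reports only `@x`, so the early allocation is not observable from Ruby code.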

You could also just use a hashtable for each instance to hold all of
the variables, like Ruby does, but the fact that Smalltalk can do
direct instance variable access ends up being a nice speed and memory
gain, so I'd rather not give that up.

The Smalltalk Ruby will still need to handle more dynamic methods of instance variable creation:

class MyClass
   def add_ivar(name)
     instance_variable_set(name, nil)
   end
end

What happens to instances that have already been created when a new instance variable is seen by the compiler?

There are also issues with using more memory than necessary if the interpreter creates every instance variable the moment it is observed by the compiler.

I'm guessing Ruby instance variables will have to be created dynamically.

···

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

Avi / Peter -- thanks for providing the feedback I was looking for.
Ruby and Smalltalk do indeed sound like a good match. I remember Robert
Feldt saying a few years ago that he'd done some experiments with Ruby

From now on, if you are completely ignorant of Smalltalk,
I will cease answering these questions here.

Yikes! I _am_ almost completely ignorant of Smalltalk (I know the
basic concepts of the language, but I've never programmed in it), but
I've done some work on VMs and was genuinely interested in the answers.

Sorry, it was late at night, and that came off harsher than it should
have. It's just that I'm finding myself essentially repeating myself.
The feedback has informed my thinking and has been valuable, but I
should just tell people to go and find and read docs. I will get around
to posting them on the RubyForge site. This project is named Alumina.
Ruby is just one crystalline form of Alumina.

Thank you for that apology! Please understand you are speaking to
a group of (mostly) interested _ruby_ developers, many of whom, I,
for instance, have little or no experience with Smalltalk let alone
its VM infrastructure. You announced the topic so I think it is fair
to expect you to answer any reasonable questions from your audience
considering our collective background.

The project certainly seems interesting to me; I am somewhat reserved
about it as I would rather see as much of the available talent to go
to YARV development rather than various different projects but, then
again, there seems to be a lot of talent to go around. Just produce
code and ideas that can be reused! :)

Good luck!

--Peter

E

···

Le 8/4/2005, "Peter Suk" <peter.kwangjun.suk@mac.com> a écrit:

On Apr 8, 2005, at 8:19 AM, George wrote:

on Smallscript's VM that looked promising.

Peter said:

--
No-one expects the Solaris POSIX implementation!

"Peter Suk" <peter.kwangjun.suk@mac.com> wrote in message

> The fact that Ruby instance variables appear on demand, and so references to
> instance-vars cannot be directly mapped to integer offsets from a known
> class definition, will likely mean some performance trade-off.

Already covered this.

Replacing every existing object of a class with a new static layout has a
very different performance profile from updating the dynamic layout of a
single object. Silently creating a singleton class might result in subtle
changes in Ruby semantics. Going all-dynamic layout takes a big performance
hit.

Sounds like a performance trade-off area to me.

Did you ever think about the legal problems of your project when
using the VisualWorks engine?

Yes. I used to work for them.

I took a quick look at the Cincom web page and
I guess it is as expensive as ever, which means > 5000 US$ per
license. I hate it when companies do not even publish their prices
without a contact form.

Yes. I wish they'd cut that out, but they can be quite "old-school" software industry-wise.

The main purpose for the VisualWorks version is for commercial server images that have to run fast. Certain organizations will prefer to have a product with official support and will pay a premium for this and speed. If/When this happens, I'll license their Object Engine as a VAR, and pass on the licensing costs. (Also, they are looking for new products to sell.) Otherwise, the Ruby image can just be a fast platform for education and academic research in Ruby just as it is for Smalltalk. Also, it is the platform that I know best, so it is a good place for me to start. For purposes of just having the Refactoring, Debugging, Browsing environment, people who want free as in speech will probably opt for hosting on Squeak.

--Peter

···

On Apr 8, 2005, at 5:39 AM, Lothar Scholz wrote:


Patrick Down wrote:

What about systems like Rails where members are added to the class
definition at runtime from a database definition? ( I assume that's how
it works. )

If you're talking about methods, that's no problem at all - like Ruby,
Smalltalk makes no distinction between compile-time and runtime (in
fact, Smalltalk environments don't even make a distinction between
edit-time and runtime), so adding (or removing, or modifying) methods
at "runtime" is the usual case.
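As a rough illustration of adding methods at runtime from a list of column names: the sketch below is hypothetical and is not how Rails itself is implemented. `Record`, `define_columns`, and the hard-coded `columns` array (standing in for names read from a database schema) are all assumptions.

```ruby
class Record
  # Define a reader and a writer for each column name, at runtime.
  def self.define_columns(names)
    names.each do |name|
      define_method(name) { @attributes[name] }
      define_method("#{name}=") { |value| @attributes[name] = value }
    end
  end

  def initialize
    @attributes = {}
  end
end

columns = %w[title author]   # stand-in for a schema lookup
Record.define_columns(columns)

rec = Record.new
rec.title = "Alumina"
title  = rec.title    # => "Alumina"
author = rec.author   # => nil (never assigned)
```

Note that the accessors here store their values in a single hash, so no new instance variables are needed on the class when columns are added.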

If you're talking about instance variables: when you add a new instance
variable to a class, the system walks through all of the old instances
of that class and makes a new copy with space allocated for the new
inst var. The next step, which requires VM support, is to do an atomic
swap of all of the old instances for all of the new instances (some
Smalltalks can do this faster than others, but in the worst case it
requires a full garbage collection).
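The two steps described above (copy every old instance into a wider layout, then swap old for new) can be modelled in a few lines of Ruby. This is a toy under stated assumptions: `ToyClass` and `ToyInstance` are hypothetical names, a plain array stands in for the VM's ability to find all instances, and a handle with a replaceable body stands in for the VM-level atomic swap.

```ruby
ToyInstance = Struct.new(:body)   # a stable handle whose body can be swapped

class ToyClass
  def initialize(slot_names)
    @slot_names = slot_names.dup
    @instances = []               # stand-in for the VM's "all instances" walk
  end

  def new_instance
    inst = ToyInstance.new(Array.new(@slot_names.size))
    @instances << inst
    inst
  end

  def add_slot(name)
    @slot_names << name
    # Step 1: build a widened copy of every existing body.
    widened = @instances.map { |inst| inst.body + [nil] }
    # Step 2: swap old bodies for new ones; handles keep their identity.
    @instances.zip(widened) { |inst, body| inst.body = body }
  end

  def get(inst, name)
    inst.body[index_of(name)]
  end

  def set(inst, name, value)
    inst.body[index_of(name)] = value
  end

  private

  def index_of(name)
    @slot_names.index(name) || raise(NameError, "no slot named #{name}")
  end
end

point_class = ToyClass.new([:x])
pt = point_class.new_instance
point_class.set(pt, :x, 1)
point_class.add_slot(:y)          # the existing instance gains a slot
point_class.set(pt, :y, 2)
```

The cost of `add_slot` grows with the number of live instances, while `get` and `set` remain simple indexed accesses, which is the trade-off argued for in the next paragraph.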

My guess is that for almost all applications, the occasional added cost
when adding inst vars would be more than made up for by the increased
speed when accessing them, and the reduced memory consumption (and thus
reduced load on the GC).

Avi

Glenn Parker wrote:

The Smalltalk Ruby will still need to handle more dynamic methods of
instance variable creation:

class MyClass
   def add_ivar(name)
     instance_variable_set(name, nil)
   end
end

What happens to instances that have already been created when a new
instance variable is seen by the compiler?

Just posted on that...

There are also issues with using more memory than necessary if the
interpreter creates every instance variable the moment it is observed by
the compiler.

Only in truly pathological cases - for normal numbers of instance
variables, the overhead of an external lookup table would be higher
than that of keeping a few unused slots in the body of the object. But
of course if the compiler saw that instances of a particular class
might have up to 50 instance variables, it could choose to implement
that class with a hashtable for variables. I'm not convinced that case
is likely enough to be worth checking and optimizing for, but who
knows.

Avi

You missed the second to last sentence, "... the cached target and some checking information is included inline with the dynamically translated methods."

This can be a big boost because you don't need to look up the method or even dispatch to it anymore; it's right there where it needs to be.
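A rough model of the inline-caching idea, in Ruby. A real VM patches the cached target and the class check into the translated machine code at each send site; this sketch only imitates that with one small object per call site. `CallSite` and `misses` are hypothetical names.

```ruby
class CallSite
  attr_reader :misses

  def initialize(selector)
    @selector = selector
    @cached_class = nil     # the "checking information"
    @cached_method = nil    # the "cached target"
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass.equal?(@cached_class)
      # Miss: do the full lookup once and remember it at this site.
      @misses += 1
      @cached_class = klass
      @cached_method = klass.instance_method(@selector)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = CallSite.new(:size)
sizes = [[1, 2], [3], []].map { |array| site.call(array) }  # => [2, 1, 0]
misses_after_arrays = site.misses                            # => 1
string_size = site.call("abc")                               # => 3, a second miss
```

When a send site keeps seeing the same receiver class, every call after the first skips the lookup entirely and pays only for the class check.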


···

On 08 Apr 2005, at 03:39, Lothar Scholz wrote:

---------------------------
In a typical system it often turns out that the same message is sent to instances of the
same class again and again; consider how often we use arrays of SmallInteger or
Character or String. To improve average performance, the VM can cache the
found method. If the same combination of the method and the receiver's class are
found in the cache, we avoid a repeat of the full search of the MethodDictionary
chain. See the method Interpreter > lookupInMethodCacheSel:class:
for the implementation.
VisualWorks and some other commercial Smalltalks use inline cacheing, whereby the
cached target and some checking information is included inline with the dynamically
translated methods. Although more efficient, it is more complex and has strong
interactions with the details of the cpu instruction and data caches.
---------------------------

So message dispatching with Squeak is not much more efficient than what
I expect from YARV: a simple bytecode dispatcher with dynamic method
lookup tables and a small lookup cache.

--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04

Saynatkari wrote:

The project certainly seems interesting to me; I am somewhat reserved
about it as I would rather see as much of the available talent to go
to YARV development rather than various different projects but, then
again, there seems to be a lot of talent to go around. Just produce
code and ideas that can be reused! :)

I don't think it is harmful in any way. Consider web frameworks. Everyone and their mothers seem to have written one for Ruby, yet still there is no shortage of good ones (Rails etc).

Thank you for that apology! Please understand you are speaking to
a group of (mostly) interested _ruby_ developers, many of whom, I,
for instance, have little or no experience with Smalltalk let alone
its VM infrastructure. You announced the topic so I think it is fair
to expect you to answer any reasonable questions from your audience
considering our collective background.

It's fair I answer them once. However, it is amusing that I'm getting a lot of questions in the vein of "Does Smalltalk have X?" where the answer is really: "Funny you should mention that, but arguably Smalltalk was X's first widespread commercial implementation. Smalltalk has had X for (over a decade | since the beginning in 1972.)"

The project certainly seems interesting to me; I am somewhat reserved
about it as I would rather see as much of the available talent to go
to YARV development rather than various different projects but, then
again, there seems to be a lot of talent to go around. Just produce
code and ideas that can be reused! :)

Tell me that there will be a full-blown Object-image external to the VM, and that all of the language-specific meta-stuff will be expressed as first class Rite/Ruby Objects, to the point that *only* executing bytecodes and allocating/GC-ing Objects is done by the VM, and I will consider it! (This means that even the language itself exists only in the image as 1st class Ruby/Rite Objects.)

--Peter

···

On Apr 8, 2005, at 1:27 PM, Saynatkari wrote:


If you want to influence the project, join it and contribute code. Otherwise, it's pointless to discuss this now.

--Peter

···

On Apr 9, 2005, at 2:14 PM, itsme213 wrote:

"Peter Suk" <peter.kwangjun.suk@mac.com> wrote in message

The fact that Ruby instance variables appear on demand, and so references to
instance-vars cannot be directly mapped to integer offsets from a known
class definition, will likely mean some performance trade-off.

Already covered this.

Replacing every existing object of a class with a new static layout has a
very different performance profile from updating the dynamic layout of a
single object. Silently creating a singleton class might result in subtle
changes in Ruby semantics. Going all-dynamic layout takes a big performance
hit.

Sounds like a performance trade-off area to me.


Hello Eric,

···

On 08 Apr 2005, at 03:39, Lothar Scholz wrote:

You missed the second to last sentence, "... the cached target and some
checking information is included inline with the dynamically translated
methods."

This can be a big boost because you don't need to look up the method or
even dispatch to it anymore; it's right there where it needs to be.

No, I did not miss it. It says that this is _NOT_ implemented in Squeak,
only in the commercial VMs, which as I pointed out are still extremely
expensive for all of us who use it for some scripts in their
companies.

So I doubt that the Squeak engine can do much better than YARV when
both use the same technology (at least for message calling).

--
Best regards, emailto: scholz at scriptolutions dot com
Lothar Scholz http://www.ruby-ide.com
CTO Scriptolutions Ruby, PHP, Python IDE 's

Yikes, I walked back from the car when I realized...LISP! I don't want to argue that Smalltalk was the first widespread commercial implementation of X, because some Lisp-er will inevitably tell me that it was done in 1970-something, and further come back to me with photos from an archaeological dig where paleo-lispers were doing semi Aspect-Oriented things with "around" methods and CDR was implemented by bashing the end off a rock with a large teak club.

Lots of neat stuff was invented awhile back, and it's only just *really* hitting the mainstream. I'm betting on Ruby to be that horse!

--Peter

···

On Apr 8, 2005, at 1:59 PM, Peter Suk wrote:

On Apr 8, 2005, at 1:27 PM, Saynatkari wrote:

You announced the topic so I think it is fair
to expect you to answer any reasonable questions from your audience
considering our collective background.

It's fair I answer them once. However, it is amusing that I'm getting a lot of questions in the vein of "Does Smalltalk have X?" where the answer is really: "Funny you should mention that, but arguably Smalltalk was X's first widespread commercial implementation. Smalltalk has had X for (over a decade | since the beginning in 1972.)"


Having your variables in a Hashtable is something I've heard of as a means of giving Java objects more reflective & dynamic capabilities. I've also done it as a quick and dirty way of storing Smalltalk objects as serialized files, but still being able to read the old ones even after modifying their classes and adding instance variables. (Which is not the right way to do it, but from an implementation standpoint is slightly quicker than the right way.)

However, when you do this, you are taking a *big* performance hit. I suspect that this is one of the reasons why the Ruby VMs and the earlier Python VMs are so slow -- a trade-off has been made for programmer/VM-implementor convenience. Alumina will maintain speed and programmer convenience in exchange for VM-implementor inconvenience.

--Peter

···

On Apr 8, 2005, at 11:05 AM, Avi Bryant wrote:

Glenn Parker wrote:

There are also issues with using more memory than necessary if the
interpreter creates every instance variable the moment it is observed by
the compiler.

Only in truly pathological cases - for normal numbers of instance
variables, the overhead of an external lookup table would be higher
than that of keeping a few unused slots in the body of the object. But
of course if the compiler saw that instances of a particular class
might have up to 50 instance variables, it could choose to implement
that class with a hashtable for variables. I'm not convinced that case
is likely enough to be worth checking and optimizing for, but who
knows.


"Peter Suk" <peter.kwangjun.suk@mac.com> wrote in message

> Sounds like a performance trade-off area to me.

If you want to influence the project, join it and contribute code.
Otherwise, it's pointless to discuss this now.

I've read this thread with considerable interest, and am certain many of us
here would love to see your plans bear fruit. I do think you're more
likely to succeed if you take on board the questions people ask or the possible
gotchas they point out.

The above documentation is, in part, wrong.

ftp://st.cs.uiuc.edu/Smalltalk/Squeak/docs/OOPSLA.Squeak.html

See the last paragraph of Smalltalk to C Translation.

See also the last paragraph of Performance and Optimization.


···

On 08 Apr 2005, at 11:19, Lothar Scholz wrote:

Hello Eric,

> On 08 Apr 2005, at 03:39, Lothar Scholz wrote:

> You missed the second to last sentence, "... the cached target and some
> checking information is included inline with the dynamically translated
> methods."

> This can be a big boost because you don't need to look up the method or
> even dispatch to it anymore; it's right there where it needs to be.

No, I did not miss it. It says that this is _NOT_ implemented in Squeak,
only in the commercial VMs, which as I pointed out are still extremely
expensive for all of us who use it for some scripts in their
companies.

So I doubt that the Squeak engine can do much better than YARV when
both use the same technology (at least for message calling).

--
Eric Hodel - drbrain@segment7.net - http://segment7.net
FEC2 57F1 D465 EB15 5D6E 7C11 332A 551C 796C 9F04