Seven new VMs, all in a row

I think the "wal-mart argument" is quite an important one. Apart from explicitly creating threads, it would be nice if the Ruby system could be taught to automatically recognize parallelizable code and optimally distribute it across a multiprocessor system -- implicitly. That would be a big advantage for high-level programming in general! I do not know the state of the art in this; I only remember that the Atari/Inmos guys failed to do this in Occam, back in the 1980s. Do you think there is a serious chance to get such a thing working?

My first thought concerning "explicit" threads was: Have a look at Erlang (http://www.erlang.org), which is explicitly designed for distributed applications. >20'000 threads per application are not uncommon in large Erlang programs, so I take it for granted that their implementation must be pretty highly optimized. (Don't you ask me which kind of threads is underlying the Erlang VM, though -- concurrency is a grammar-level feature of Erlang). However, it's all about CSPs. No shared memory. So this won't be much of a help, will it?

-- Ruediger Marcus

···

On Saturday, 9 April 2005 at 22:38, Glenn Parker wrote:

Avi Bryant wrote:
> Glenn Parker wrote:
>>Yes, I want synchronous I/O and true multi-cpu execution.
>
> And shared memory - right?

Of course, otherwise threads are no more interesting than (heavy-weight)
processes.

> Because with Ruby or Smalltalk, there's
> nothing stopping you from using multiple CPUs by having multiple
> instances of the VM, communicating through DRb or whatever.

Right, nothing except the overhead and occasional confusion that results
from passing objects between processes.
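
To make the "occasional confusion" concrete: an object handed to another process is serialized and rebuilt as a copy, so changes on one side are invisible on the other. The following is only a sketch under assumptions: it uses Marshal inside a single process as a stand-in for what happens when an object crosses a process boundary by value, and the `owner`/`balance` record is made up for illustration.

```ruby
# Made-up record; any marshalable object behaves the same way.
original = { owner: "glenn", balance: 100 }

# Simulate sending the object to another VM: serialize, then rebuild.
wire_bytes = Marshal.dump(original)
received   = Marshal.load(wire_bytes)

# The receiver mutates its copy...
received[:balance] -= 30

# ...but the sender's object is untouched, unlike with shared-memory threads.
puts original[:balance]
puts received[:balance]
puts original.equal?(received)
```

With shared-memory threads both sides would hold the same object, which is exactly the convenience (and the hazard) being argued about here.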

> The engineers that build their networking and database
> systems always claim that the green threads scale much, much better -
> porting such code from VisualWorks to ObjectStudio always involves a
> lot of headaches in getting it to perform decently under the
> disadvantage of native threads.

Yup, I've been there (using C++, not Smalltalk). If you write code
assuming that you have "unlimited" threads (which green-style threading
encourages), some OS threading implementations will really knock you
flat on your keister. If you can redesign to treat threads as limited
resources (more like processes), native threads excel. I'm not saying
one threading style is necessarily "better"; they each have their place
and they each require compromises.
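
One common way to "treat threads as limited resources" is a fixed-size worker pool fed from a queue, rather than one thread per task. This is a minimal sketch, not anything from the code discussed above; the pool size of 4 and the squaring "work" are placeholders chosen for illustration.

```ruby
pool_size = 4          # assumed limit; tune to what the OS handles well
jobs      = Queue.new
results   = Queue.new

# A fixed set of workers pulls jobs from the queue,
# instead of spawning one thread per job.
workers = Array.new(pool_size) do
  Thread.new do
    while (n = jobs.pop) != :stop
      results << n * n   # stand-in for real work
    end
  end
end

100.times { |i| jobs << i }
pool_size.times { jobs << :stop }   # one stop marker per worker
workers.each(&:join)

squares = []
squares << results.pop until results.empty?
puts squares.sort.first(5).inspect
```

The number of threads stays at `pool_size` no matter how many jobs are queued, which is the redesign being described: the job count scales, the thread count does not.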

> The best setup would probably be to have many green threads to each (of
> multiple) native thread, but that's a huge engineering effort.

Sort of like what Solaris already did, and rather nicely at that.
Sadly, nobody followed in their tracks, or I would have had a lot less
code to redesign.

> I think
> Matz made the right decision: keep things simple and under the direct
> control of the VM.

I assert that a VM can be threaded and still maintain direct control, it
just won't be quite as simple.

--
Chevalier Dr Dr Ruediger Marcus Flaig
   Institute for Immunology
   University of Heidelberg
   INF 305, D-69121 Heidelberg

"Drain you of your sanity,
Face the Thing That Should Not Be."


IIRC the Erlang threads are green threads and OS level parallelism is achieved through running several Erlang processes.

Re: green threads vs native threads, if a green threads implementation is 30 times faster, that's like having 29 extra cpus, no?

Ilmari Heikkinen

···

On 11.4.2005, at 12:04, flaig@sanctacaris.net wrote:

My first thought concerning "explicit" threads was: Have a look at Erlang (http://www.erlang.org), which is explicitly designed for distributed applications. >20'000 threads per application are not uncommon in large Erlang programs, so I take it for granted that their implementation must be pretty highly optimized. (Don't you ask me which kind of threads is underlying the Erlang VM, though -- concurrency is a grammar-level feature of Erlang). However, it's all about CSPs. No shared memory. So this won't be much of a help, will it?

--
66. The regions beyond these places are either difficult of access because of their excessive winters and great cold, or else cannot be sought out because of some divine influence of the gods.

flaig@sanctacaris.net wrote:

I think the "wal-mart argument" is quite an important one.

I'm not sure exactly what the "wal-mart argument" is. Wal-Mart can be seen as a big U.S. conglomerate that moves into a town and drives all the mom-and-pop stores out of business, eventually drying up downtown business districts. Or it can be seen as a big discount retailer that provides cheap imported goods using its massive warehousing and distribution networks, while undercutting domestic manufacturers.

Obviously, I'm not a big Wal-Mart fan, but maybe their brutal retail success strategy has lessons for Ruby? :)

Apart from explicitly creating threads, it would be nice if
the Ruby system could be taught to automatically recognize
parallelizable code and optimally distribute it across a
multiprocessor system -- implicitly. That would be a big
advantage for high-level programming in general! I do not know
the state of the art in this; I only remember that the
Atari/Inmos guys failed to do this in Occam, back in the 1980s.
Do you think there is a serious chance to get such a thing working?

The only programming environment I'm familiar with where somebody implemented automatic parallel optimization is Fortran (although I'm sure there are others). Fortran's branching and memory models are constrained enough to allow for some clever analysis. Loops where each iteration has no impact on the next can be discovered and converted into short-term fine-grained parallel execution. In that case, the original code has no concept of threading, it just runs faster during the inner loops.
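
As a rough illustration of the kind of loop meant here, the sketch below applies the transformation by hand in Ruby: every iteration reads one input element and writes one output element, so the index range can be cut into chunks that run separately. The chunk count and the arithmetic are arbitrary, and on the standard Ruby interpreter these threads do not execute Ruby code on several CPUs at once, so this shows only the shape of the rewrite, not a speedup.

```ruby
# Each iteration reads only input[i] and writes only output[i],
# so no iteration depends on another.
input = (1..1_000).to_a

sequential = input.map { |x| x * x + 1 }

# Hand-applied version of what a parallelizing compiler might do:
# split the index range into chunks and run each chunk separately.
chunks     = 4
slice_size = (input.size / chunks.to_f).ceil
parallel   = Array.new(input.size)

threads = input.each_slice(slice_size).each_with_index.map do |slice, chunk|
  Thread.new do
    offset = chunk * slice_size
    slice.each_with_index { |x, i| parallel[offset + i] = x * x + 1 }
  end
end
threads.each(&:join)

puts parallel == sequential
```

The hard part for a compiler is proving the independence that this example simply assumes; in Ruby a block can reach any object, which is why the analysis is so much harder than in Fortran.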

None of that would carry over to a thread-aware language with a dynamic type system.

Ilmari Heikkinen wrote:

Re: green threads vs native threads, if a green threads implementation
is 30 times faster, that's like having 29 extra cpus, no?

"You can't get blood from a stone." The only thing that is "like having 29 extra cpus" is actually having 29 extra CPUs. :)

···

--
Glenn Parker | glenn.parker-AT-comcast.net | <http://www.tetrafoil.com/>

I know that people have been agitating for a shared "Perm Space" between different instances of running VisualWorks VMs. This may have happened already while I wasn't looking. (Perm Space contains objects which are exempt from Garbage Collection. When deploying Smalltalk applications, one often does a "Perm Save" so that the application classes are never scanned.) I have also heard that other uses of inter-process shared memory are planned for the VM.

--Peter

···

On Apr 11, 2005, at 4:04 AM, flaig@sanctacaris.net wrote:

My first thought concerning "explicit" threads was: Have a look at Erlang (http://www.erlang.org), which is explicitly designed for distributed applications. >20'000 threads per application are not uncommon in large Erlang programs, so I take it for granted that their implementation must be pretty highly optimized. (Don't you ask me which kind of threads is underlying the Erlang VM, though -- concurrency is a grammar-level feature of Erlang). However, it's all about CSPs. No shared memory. So this won't be much of a help, will it?

-- Ruediger Marcus

--
There's neither heaven nor hell, save what we grant ourselves.
There's neither fairness nor justice, save what we grant each other.

For OO languages running on top of a Virtual Machine, Object semantics may give rise to automatic parallel optimization.

You could deploy an application with certain objects declared "immutable." (Like its base & application Classes.) Then you could divide the image into independent partitions where the objects in each partition have nothing to do with each other except through these immutable objects. If you can enforce that no one writes into a partition from outside of it (including the immutables), then you can safely give each one its own thread. Something akin to a GC sweep could do this automatically.
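
A small Ruby sketch of the idea, with the partitioning done by hand rather than discovered by the VM: the shared table is frozen, each partition is a separate array touched by exactly one thread, and a write to the shared object is rejected at run time. The currency table and order data are invented for illustration, and note that `freeze` is shallow, so a real scheme would have to freeze everything reachable from the shared objects.

```ruby
# Shared, immutable data: frozen so that no partition can write to it.
# Rates are made-up integer figures (hundredths).
rates = { "EUR" => 100, "USD" => 130 }.freeze

# Two partitions whose objects reference nothing outside themselves
# except the frozen table above.
partitions = [
  [["EUR", 10], ["USD", 20]],
  [["USD", 5],  ["EUR", 7]],
]

# One thread per partition; each computes only from its own orders.
totals = partitions.map do |orders|
  Thread.new { orders.sum { |currency, amount| rates[currency] * amount } }
end.map(&:value)

# An attempted write to the shared table is rejected at run time.
begin
  rates["GBP"] = 150
  write_rejected = false
rescue RuntimeError   # FrozenError on newer Rubies is a RuntimeError subclass
  write_rejected = true
end

puts totals.inspect
puts write_rejected
```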

--Peter

···

On Apr 11, 2005, at 7:56 AM, Glenn Parker wrote:

flaig@sanctacaris.net wrote:

Apart from explicitly creating threads, it would be nice if
the Ruby system could be taught to automatically recognize
parallelizable code and optimally distribute it across a
multiprocessor system -- implicitly. That would be a big
advantage for high-level programming in general! I do not know
the state of the art in this; I only remember that the
Atari/Inmos guys failed to do this in Occam, back in the 1980s.
Do you think there is a serious chance to get such a thing working?

The only programming environment I'm familiar with where somebody implemented automatic parallel optimization is Fortran (although I'm sure there are others). Fortran's branching and memory models are constrained enough to allow for some clever analysis. Loops where each iteration has no impact on the next can be discovered and converted into short-term fine-grained parallel execution. In that case, the original code has no concept of threading, it just runs faster during the inner loops.

None of that would carry over to a thread-aware language with a dynamic type system.


Re: green threads vs native threads, if a green threads implementation
is 30 times faster, that's like having 29 extra cpus, no?

It's just the context switch that gets faster... :)

(Well, and as has been stated, maybe also the garbage
collector can be faster too with green threads. But it's
not like native machine code goes any faster through the
CPU :)

Regards,

Bill

···

From: "Ilmari Heikkinen" <kig@misfiring.net>

I know that people have been agitating for a shared "Perm Space"
between different instances of running VisualWorks VMs. This may
have happened already while I wasn't looking.

Maybe at year end:
Shared Perm Space - ability to have multiple images share common object

http://www.cincomsmalltalk.com/CincomSmalltalkWiki/Cincom+Smalltalk+Winter+2005

My intent (and I'm making this up as I type :) was something like:
If making a fast VM with native thread support is hard enough to seriously bog down development speed, maybe it shouldn't be the number one priority.

···

On 11.4.2005, at 21:10, Bill Kelly wrote:

From: "Ilmari Heikkinen" <kig@misfiring.net>

Re: green threads vs native threads, if a green threads implementation
is 30 times faster, that's like having 29 extra cpus, no?

It's just the context switch that gets faster... :)

(Well, and as has been stated, maybe also the garbage
collector can be faster too with green threads. But it's
not like native machine code goes any faster through the
CPU :)

"Peter Suk" <peter.kwangjun.suk@mac.com> wrote in message

If you can enforce that no one writes into a
partition from outside of it (including the immutables), then you can
safely give each one its own thread.

IIRC, this is an extremely difficult thing to check. My recollection is
based on work from a while ago on aliasing, islands, etc.

If the VM is doing this, then you can do this efficiently by comparing pointers. I've tried constructing write barriers above the level of the VM in Smalltalk (manipulating compilation), and yes, it's difficult. You only need one "leak" and this invalidates the whole barrier. But the VM can partition the object-space very efficiently and by doing it at such a low level, it is much more secure. VMs do precisely this sort of partitioning to implement garbage collection algorithms.
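
To show what "comparing pointers" could amount to, here is a toy model in Ruby, not real VM code: each partition owns a contiguous range of integer "addresses", and the write barrier is a range check on every store. The partition names, address ranges, and the `write` lambda are all hypothetical.

```ruby
# Toy model: each partition owns a contiguous range of "addresses",
# so a membership test is just a pair of integer comparisons.
ranges = { a: 0...100, b: 100...200 }
slots  = {}

# The write barrier: a writer may only store into its own partition.
write = lambda do |writer, address, value|
  unless ranges.fetch(writer).cover?(address)
    raise ArgumentError, "partition #{writer} may not write to address #{address}"
  end
  slots[address] = value
end

# A store inside the writer's own partition goes through.
write.call(:a, 42, "ok")

# A store into another partition is the "leak" the barrier must catch.
leak_blocked =
  begin
    write.call(:a, 150, "leak")
    false
  rescue ArgumentError
    true
  end

puts slots.inspect
puts leak_blocked
```

Because every store in this model goes through `write`, there is no path around the check; that is the advantage being claimed for doing it in the VM rather than by rewriting compiled code above it.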

You could do this in Squeak by using the VM simulator and collections for your partitions. It wouldn't be as efficient, but it should be as secure. (So if we succeed in moving Ruby on top of Squeak VM with the Alumina project, we could do this for Ruby!)

--Peter

···

On Apr 13, 2005, at 8:49 PM, itsme213 wrote:

"Peter Suk" <peter.kwangjun.suk@mac.com> wrote in message

If you can enforce that no one writes into a
partition from outside of it (including the immutables), then you can
safely give each one its own thread.

IIRC, this is an extremely difficult thing to check. My recollection is
based on work from a while ago on aliasing, islands, etc.
