How to best evolve Ruby to accommodate 80-core
CPU programming?
http://www.ddj.com/dept/architect/196901985
Or does it already?
Later,
--
Bil Kleb
http://fun3d.larc.nasa.gov
On Jan 22, 2007, at 23:25, Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
With An 80-Core Chip On The Way, Software Needs To Start Changing | Dr Dobb's
Or does it already?
Keep Koichi employed?
--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
I LIT YOUR GEM ON FIRE!
Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
http://www.ddj.com/dept/architect/196901985
Or does it already?
Possible but not easy with fork + IPC, I think. Otherwise, no. Neither
does Perl nor Python.
So far the only language I've seen specifically designed for multiple
CPUs/cores is Fortress, and it's alpha.
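To make the fork + IPC idea concrete, here is a minimal sketch, assuming a Unix platform with fork; the child's workload is just a stand-in:

```ruby
# Child does some CPU-bound work and reports back over a pipe.
reader, writer = IO.pipe

pid = fork do
  reader.close
  sum = 0
  1.upto(1_000_000) { |i| sum += i }  # stand-in for real work
  writer.puts sum                     # ship the result to the parent
  writer.close
end

writer.close
puts reader.gets.to_i  # parent blocks here until the child reports
Process.wait(pid)
```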
Regards,
Dan
Decompose large tasks into a number of cooperating processes (not threads)
that approximates the number of cores available. Knit them together with
asynchronous messaging and write your programs in an event-driven style.
Best practices for doing the above? Not quite there yet, but a lot of people
are working on them.
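A rough sketch of that shape in plain Ruby, with pipes standing in for the messaging layer and IO.select as the event loop; the worker count and payload are placeholders:

```ruby
# Cooperating worker processes knit together with pipes; the parent
# runs a select-based event loop (Unix platforms only).
WORKERS = 4  # placeholder: approximate the number of cores available

pipes = (1..WORKERS).map do |n|
  reader, writer = IO.pipe
  fork do
    reader.close
    writer.puts "result from worker #{n}"  # placeholder message
    writer.close
    exit!
  end
  writer.close
  reader
end

# Event loop: react to whichever worker speaks first.
until pipes.empty?
  ready, = IO.select(pipes)
  ready.each do |io|
    if (line = io.gets)
      puts "parent got: #{line}"
    else
      io.close          # EOF: this worker is done
      pipes.delete(io)
    end
  end
end
Process.waitall
```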
On 1/23/07, Bil Kleb <Bil.Kleb@nasa.gov> wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
http://www.ddj.com/dept/architect/196901985
Or does it already?
Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
A version of NArray that parallelizes its
work (a task that could be made easier using
OpenMP or similar) would work especially well
if the CPU-intensive part of your application
is math.
For more mundane tasks (web serving), an
obvious answer would be to simply fork
off 80 Ruby processes, which would efficiently
use the 80 cores.
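In outline, something like the following, assuming a Unix platform; NUM_CORES and the per-slice work are placeholders:

```ruby
# One Ruby process per core; each child handles its own slice.
NUM_CORES = 80  # placeholder: match the actual core count

pids = (0...NUM_CORES).map do |slice|
  fork do
    # stand-in for real work on this process's slice of the problem
    1_000.times { Math.sqrt(slice + rand) }
  end
end
pids.each { |pid| Process.wait(pid) }
```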
Eric Hodel wrote:
On Jan 22, 2007, at 23:25, Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
With An 80-Core Chip On The Way, Software Needs To Start Changing | Dr Dobb's
Or does it already?
Keep Koichi employed?
I think it's time I posted my "we've been here before" rant about concurrency and massively parallel computers on my blog.
For starters, do a Google search for the writings of Dr. John Gustafson, who is now a senior researcher at Sun Microsystems.
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.
Daniel Berger wrote:
Bil Kleb wrote:
> How to best evolve Ruby to accommodate 80-core
> CPU programming?
>
> With An 80-Core Chip On The Way, Software Needs To Start Changing | Dr Dobb's
>
> Or does it already?
Possible but not easy with fork + IPC, I think. Otherwise, no. Neither
does Perl nor Python.
So far the only language I've seen specifically designed for multiple
CPUs/cores is Fortress, and it's alpha.
Fortress, eh? Never heard of it...
Actually, there are a couple of languages you could use on that machine
that are far, far from beta.
Your best bet for that machine is Lua at this point in time. Lua is
multi-thread ready and pretty stable. If you squint a little and don't
do any OO, Lua's syntax looks like Ruby.
It's nowhere near as nice to do OO in as Ruby (or Python, for that
matter), but it's doable. It's a tiny bit nicer than Perl's OO (but
not by much).
And good old, somewhat dusty Tcl has always been thread-friendly.
Tcl's OO is kind of a big mess, as it is not native to the language and
there are two or three competing frameworks for it. However, Tcl's big
plus is that it has been around the block for a long, long time.
Ron M wrote:
Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
A version of NArray that parallelizes its
work (a task that could be made easier using
OpenMP or similar) would work especially well
if the CPU-intensive part of your application
is math.
For more mundane tasks (web serving), an
obvious answer would be to simply fork
off 80 Ruby processes, which would efficiently
use the 80 cores.
Uh ... be careful ... processes take up space in cache and in RAM. The only thing that would be sharable is the memory used for code ("text" in Linux terms). I think what you want is *lightweight* processes a la Erlang, which Ruby doesn't have yet. It does have *threads*, though.
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.
I just remembered a couple of Perl modules that may be of interest:
Dan
On Jan 23, 9:56 am, "Daniel Berger" <djber...@gmail.com> wrote:
Bil Kleb wrote:
> How to best evolve Ruby to accommodate 80-core
> CPU programming?
> With An 80-Core Chip On The Way, Software Needs To Start Changing | Dr Dobb's
> Or does it already?
Possible but not easy with fork + IPC, I think.
Francis Cianfrocca wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
With An 80-Core Chip On The Way, Software Needs To Start Changing | Dr Dobb's
Or does it already?
Decompose large tasks into a number of cooperating processes (not threads)
that approximates the number of cores available. Knit them together with
asynchronous messaging and write your programs in an event-driven style.
Best practices for doing the above? Not quite there yet, but a lot of people
are working on them.
1. Lightweight processes, please.
2. A lot of people have been working on them for *decades*. We were forced into hiding by increasing clock speeds, huge caches and multiple copies of all the register sets on chip, DSP chips to do the audio, graphics chips to do the video, and the lure of other technologies like the Internet and databases.
I just wonder how long we'll be out of hiding this time.
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.
Or, somebody could port Ruby to Erlang!
-- Matt
It's not what I know that counts. It's what I can remember in time to use.
On Wed, 24 Jan 2007, M. Edward (Ed) Borasky wrote:
Eric Hodel wrote:
On Jan 22, 2007, at 23:25, Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
http://www.ddj.com/dept/architect/196901985
Or does it already?
Keep Koichi employed?
I think it's time I posted my "we've been here before" rant about concurrency and massively parallel computers on my blog.
For starters, do a Google search for the writings of Dr. John Gustafson, who is now a senior researcher at Sun Microsystems.
See also Koichi's 2005 RubyConf presentation.
On Jan 23, 2007, at 07:57, M. Edward (Ed) Borasky wrote:
Eric Hodel wrote:
On Jan 22, 2007, at 23:25, Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
http://www.ddj.com/dept/architect/196901985
Or does it already?
Keep Koichi employed?
I think it's time I posted my "we've been here before" rant about concurrency and massively parallel computers on my blog.
For starters, do a Google search for the writings of Dr. John Gustafson, who is now a senior researcher at Sun Microsystems.
--
Eric Hodel - drbrain@segment7.net - http://blog.segment7.net
I LIT YOUR GEM ON FIRE!
SUN'S GUSTAFSON ON ENVISIONING HPC ROADMAPS FOR THE FUTURE
http://www.taborcommunications.com/hpcwire/hpcwireWWW/05/0114/109060.html
[...]
You may recall that Sun acquired the part of Cray that used to be Floating Point Systems. When I was at FPS in the 1980s, I managed the development of a machine called the FPS-164/MAX, where MAX stood for Matrix Algebra Accelerator. It was a general scientific computer with special-purpose hardware optimized for matrix multiplication (hence, dense matrix factoring as well). One of our field analysts, a well-read guy named Ed Borasky, pointed out to me that our architecture had precedent in this machine developed a long time ago in Ames, Iowa. He showed me a collection of original papers reprinted by Brian Randell, and when I read Atanasoff's monograph I just about fell off my chair. It was a SIMD architecture, with 30 multiply-add units operating in parallel. The FPS-164/MAX used 31 multiply-add units, made with Weitek parts that were about a billion times faster than vacuum tubes, but the architectural similarity was uncanny. It gave me a new respect for historical computers, and Atanasoff's work in particular. And I realized I shouldn't have been such a cynic about the historical display at Iowa State.
[...]
I can see why you're a fan.
Tom
On Jan 23, 2007, at 10:57 AM, M. Edward (Ed) Borasky wrote:
I think it's time I posted my "we've been here before" rant about concurrency and massively parallel computers on my blog.
For starters, do a Google search for the writings of Dr. John Gustafson, who is now a senior researcher at Sun Microsystems.
M. Edward (Ed) Borasky wrote:
Ron M wrote:
Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
A version of NArray that parallelizes its
work (a task that could be made easier using
OpenMP or similar) would work especially well
if the CPU-intensive part of your application
is math.
For more mundane tasks (web serving), an
obvious answer would be to simply fork
off 80 Ruby processes, which would efficiently
use the 80 cores.
Uh ... be careful ... processes take up space in cache and in RAM. The only thing that would be sharable is the memory used for code ("text" in Linux terms). I think what you want is *lightweight* processes a la Erlang, which Ruby doesn't have yet. It does have *threads*, though.
Part of the heap may be sharable, but GC makes that part small:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/186561
--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407
M. Edward (Ed) Borasky wrote:
Uh ... be careful ... processes take up space in cache and in RAM.
I said processes intentionally.
only thing that would be sharable is the memory used for code ("text" in
Linux terms).
Nope. Many (all?) OSes do copy-on-write. If a parent forks its
children after a lot of initialization is done, all the data
initialized by the parent (including, for example, loaded Ruby
modules) is shared too.
I think a lot of highly scalable servers (Oracle, PostgreSQL,
Apache 1.x, etc.) use this approach.
Fundamentally the difference between threads and processes seems
to be the following. With processes, most memory is unshared
unless you explicitly create a shared memory segment. With
threads, most memory is shared unless you explicitly create
thread-local storage. It's often easier to explicitly specify
the shared memory parts, since it makes you very aware of which
data structures need the special care of locking. And since
the VM system will protect the private memory of processes
and AFAIK you'd have to go through some hoops to make the
OS enforce access to thread local storage, you'd be safer
with the multi-process model too.
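A tiny demonstration of those two defaults, assuming a Unix platform with fork:

```ruby
shared = [0]

# Threads: memory is shared by default, so the write is visible.
Thread.new { shared[0] = 1 }.join
puts shared[0]  # => 1

# Processes: memory is private (copy-on-write) by default.
pid = fork { shared[0] = 2 }
Process.wait(pid)
puts shared[0]  # => still 1; the child's write stayed in its own copy
```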
Tom Pollard wrote:
I think it's time I posted my "we've been here before" rant about concurrency and massively parallel computers on my blog.
For starters, do a Google search for the writings of Dr. John Gustafson, who is now a senior researcher at Sun Microsystems.
SUN'S GUSTAFSON ON ENVISIONING HPC ROADMAPS FOR THE FUTURE
http://www.taborcommunications.com/hpcwire/hpcwireWWW/05/0114/109060.html
[...]
I can see why you're a fan.
Tom
Yeah, John and I worked together at FPS. But what I'm getting at is that John and I (and others within FPS and elsewhere in the supercomputing segment) would have endless discussions about the future of high-performance computing, with some saying it just *had* to be massively parallel SIMD, others saying it just *had* to be moderately parallel MIMD, and others saying, "programming parallel vector machines is just too hard -- the guys over at Intel are doubling the uniprocessor clock speed every 18 months -- in five years you'll have a Cray on your desktop".
That was "only" about 20 years ago ... I've got a 1.3 gigaflop Athlon Tbird that's still more horsepower than I need, but back then if you wanted 1.3 gigaflops you had to chain together multiple vector machines. But my real point is that no matter what solution you proposed, "the programmers weren't ready", "the languages weren't ready", "the compilers weren't ready", "the architectures weren't ready", "the components weren't ready", etc. I hear the same whining today about dual-cores, clusters, scripting languages and today's generation of programmers. And it's just as bogus now as it was then. Except that there's 20 years more practical experience and theoretical knowledge about how to do parallel and concurrent computing. So actually it's *more* bogus now!
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.
Joel VanderWerf wrote:
M. Edward (Ed) Borasky wrote:
Ron M wrote:
Bil Kleb wrote:
How to best evolve Ruby to accommodate 80-core
CPU programming?
A version of NArray that parallelizes its
work (a task that could be made easier using
OpenMP or similar) would work especially well
if the CPU-intensive part of your application
is math.
For more mundane tasks (web serving), an
obvious answer would be to simply fork
off 80 Ruby processes, which would efficiently
use the 80 cores.
Uh ... be careful ... processes take up space in cache and in RAM. The only thing that would be sharable is the memory used for code ("text" in Linux terms). I think what you want is *lightweight* processes a la Erlang, which Ruby doesn't have yet. It does have *threads*, though.
Part of the heap may be sharable, but GC makes that part small:
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/186561
IIRC, Erlang has a single shared heap per node (a node being the entity that contains the lightweight processes).
--
M. Edward (Ed) Borasky, FBG, AB, PTA, PGS, MS, MNLP, NST, ACMC(P)
http://borasky-research.blogspot.com/
If God had meant for carrots to be eaten cooked, He would have given rabbits fire.
Ron M wrote:
M. Edward (Ed) Borasky wrote:
...
only thing that would be sharable is the memory used for code ("text" in
Linux terms).
Nope. Many (all?) OSes do copy-on-write. If a parent forks its
children after a lot of initialization is done, all the data
initialized by the parent (including, for example, loaded Ruby
modules) is shared too.
That doesn't play well with Ruby's GC, which touches all reachable objects during the mark phase (parse trees are another matter).
Maybe there's a way to tell the GC that all objects existing at the time of the fork should be considered permanently reachable, so that their memory is never copied in the child due to mark(). That way you could set up a basic set of objects that children could add to but never collect. Might be useful for special purposes, but not in general.
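A small experiment that shows the effect, assuming a Unix platform; compare each process's resident set size with ps or top while it runs:

```ruby
# Build a large heap in the parent, then fork. The child's pages start
# out shared copy-on-write, but a full mark pass writes to every live
# object, forcing the kernel to copy those pages.
big = Array.new(1_000_000) { |i| "string #{i}" }

pid = fork do
  GC.start  # marking touches each reachable object, un-sharing pages
  sleep 30  # compare parent and child RSS while this sleeps
end
Process.wait(pid)
puts big.size  # keep the array alive in the parent
```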
--
vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407