There doesn't seem to be any EASY way of doing a parallel computation
in Ruby.
I would like to do something like this:
array.map do |i|
  fork do
    i + 1
  end
end
Process.waitall
which would give back the array with one added to each element, and it
would perform this "calculation" in parallel. However, this doesn't work,
since fork runs a subprocess - another Ruby interpreter - and I can't get
anything back from that black hole except some exit status.
Actually, it would be really nice if there was a 'forkmap' method that
could do this:
array.forkmap do |i|
  i + 1
end
But there isn't, right?
Ruby uses green threads. All your threads would run within the Ruby
process, so they don't run in parallel in the sense you seem to imply. If
you want to use threads, maybe JRuby and Java are what you seek.
JRuby uses Java threads, which are OS threads.
If I'm on the wrong tangent, just ignore this reply.
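If it helps, a thread-based map is easy to sketch (the `threadmap` name is made up here); on MRI 1.8's green threads it only overlaps I/O-bound work, but on JRuby the very same code runs on native threads:

```ruby
module Enumerable
  # run the block for each element in its own thread;
  # Thread#value joins the thread and returns the block's result
  def threadmap(&block)
    map { |x| Thread.new { block.call(x) } }.map { |t| t.value }
  end
end

p [1, 2, 3].threadmap { |i| i + 1 }  # => [2, 3, 4]
```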
On Tue, Apr 15, 2008 at 11:35 PM, Fredrik <fredjoha@gmail.com> wrote:
a @http://codeforpeople.com/
--
we can deny everything, except that we have the possibility of being
better. simply reflect on that.
h.h. the 14th dalai lama
Actually, I'll change it a bit. I added Process.waitall since there are
otherwise some dead (zombie?) processes left.
module Enumerable
  def fmap &b
    result = map do |*a|
      r, w = IO.pipe
      fork do
        r.close
        w.write( Marshal.dump( b.call(*a) ) )
      end
      [ w.close, r ].last
    end
    Process.waitall
    result.map{|r| Marshal.load [ r.read, r.close ].first}
  end
end
it's damn tricky - it *expects* to get an exception marshaled up a dedicated pipe; if this does *not* occur, we know the child process started successfully. you can adapt.
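the trick ara describes can be sketched roughly like this (a hand-rolled illustration of the same idea, not open4's actual code; `fork_checked` is a made-up name): the child marshals any exception up a dedicated pipe, and an empty pipe means the child ran cleanly.

```ruby
# fork the given block; if it raises, ship the exception to the
# parent over a dedicated pipe and re-raise it there (Unix only)
def fork_checked(&block)
  r, w = IO.pipe
  pid = fork do
    r.close
    begin
      block.call
      w.close                    # nothing written: success
    rescue Exception => e
      w.write(Marshal.dump(e))   # marshal the failure up the pipe
      w.close
      exit! 1
    end
  end
  w.close
  data = r.read                  # "" if the child wrote nothing
  r.close
  Process.wait(pid)
  raise Marshal.load(data) unless data.empty?
  pid
end

fork_checked { }                 # fine, returns the child's pid
begin
  fork_checked { raise "boom" }
rescue RuntimeError => e
  puts "child failed with: #{e.message}"
end
```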
I added an argument to limit the number of concurrent processes (my
workstation practically died when I ran all the processes I wanted to
run):
module Enumerable
  def forkmap n, &b
    nproc = 0   # counter lives outside the map, so it survives across elements
    result = map do |*a|
      r, w = IO.pipe
      fork do
        r.close
        w.write( Marshal.dump( b.call(*a) ) )
      end
      if (nproc += 1) >= n
        Process.wait ; nproc -= 1
      end
      [ w.close, r ].last
    end
    Process.waitall
    result.map{|r| Marshal.load [ r.read, r.close ].first}
  end
end
It seems to be doing its job correctly:
Benchmark.realtime { [1,2,3].forkmap(3){|i| sleep(1) ; i * 2} }
=> 1.01134896278381
Benchmark.realtime { [1,2,3].forkmap(1){|i| sleep(1) ; i * 2} }
it's damn tricky - it *expects* to get an exception marshaled up a
dedicated pipe; if this does *not* occur, we know the child process
started successfully. you can adapt.
I'm not sure I understand what you mean. But are you saying that open4
can solve all my problems?
It's really great. I just see one thing to improve:
The new "n" parameter is mandatory since it's the first parameter. It would be
nice if it could be left out (meaning no limit):
forkmap(4) { code } --> max 4 processes
forkmap { code } --> no limit
Do you think your code is feasible for production environments? Maybe it
involves some danger or risk? If not, I suggest you publish it in some way,
since it's really cool and a missing feature of Ruby.
On Thursday, 17 April 2008, Fredrik wrote:
I added an argument to limit the number of concurrent processes (my
workstation practically died when I ran all the processes I wanted to
run).
Oops, my mistake (didn't the 1.0 series require only JRE 1.4.2, or so?).
Yes, JRuby 1.0 worked on Java 1.4.2, but there were too many benefits moving to Java 5 to keep it that way, especially availability of annotations and the concurrency APIs.
i've got something close to gem'ing... there is nothing wrong with the concept - this is precisely how objects are returned from drb: marshaled data over a socket/pipe.
On Apr 17, 2008, at 1:02 PM, Iñaki Baz Castillo wrote:
Do you think your code is feasible for production environments? Maybe it
involves some danger or risk? If not, I suggest you publish it in some way,
since it's really cool and a missing feature of Ruby.
The new "n" parameter is mandatory since it's the first parameter. It would be
nice if it could be left out (meaning no limit):
forkmap(4) { code } --> max 4 processes
forkmap { code } --> no limit
I was thinking about that too, but as far as I understand it, Ruby only
allows optional arguments in the last positions - i.e. the "n" parameter
would have to appear after the code block. And that would look strange:
forkmap{ code }(4).
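For what it's worth, a default seems to work here after all: the block parameter `&b` isn't a positional argument, so `n` can take a default even though it is written before the block. A sketch (defaulting `n` to the collection size, i.e. effectively no limit):

```ruby
module Enumerable
  # same forkmap as above, but n defaults to the number of
  # elements, i.e. "unlimited" children when no limit is given
  def forkmap(n = nil, &b)
    n ||= count
    nproc = 0
    result = map do |*a|
      r, w = IO.pipe
      fork do
        r.close
        w.write(Marshal.dump(b.call(*a)))
      end
      if (nproc += 1) >= n
        Process.wait
        nproc -= 1
      end
      [w.close, r].last
    end
    Process.waitall
    result.map { |r| Marshal.load([r.read, r.close].first) }
  end
end

p [1, 2, 3].forkmap { |i| i * 2 }     # no limit argument needed
p [1, 2, 3].forkmap(2) { |i| i * 2 }  # at most 2 children at a time
```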