Ruby & Threads

Hello all,

I'm building an application that has to branch out and call about 10
other Ruby scripts. Each script runs for a few seconds, so running them
one after another takes too long. I've been looking into threads and
have a system that works (in tests), but I have a few questions. First
of all, the system:

require 'enumerator'

holder = []

array = (1..10).to_a
puts array.inspect

array.each_slice(3) do |group|
  group.each do |number|
    @thread = Thread.new do
      puts "Starting #{number}...\n"
      sleep(5)
      holder << number
    end
  end
end

@thread.join
puts holder.inspect

In theory, the output should look something like this:

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Starting 1...
Starting 2...
Starting 3...
Starting 4...
Starting 5...
Starting 6...
Starting 7...
Starting 8...
Starting 9...
Starting 10...
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

However, sometimes the second thread may finish before the first, etc.,
but the order doesn't matter. What matters is that the script's
execution time just went from over 20 seconds to under two! However, as
you can see, I have to call '@thread.join'. I do this because if I
don't, the script will exit before all of the threads are done
executing, so holder is always an empty array. Am I right? Or is there
some other way to keep the main script from exiting until all the
threads are done? Is there anything else I'm doing wrong?

Thanks,
Michael Boutros

--
Posted via http://www.ruby-forum.com/.

Michael Boutros wrote:

require 'enumerator'

holder = []

array = (1..10).to_a
puts array.inspect

array.each_slice(3) do |group|
  group.each do |number|
    @thread = Thread.new do
      puts "Starting #{number}...\n"
      sleep(5)
      holder << number
    end
  end
end

@thread.join

#join is definitely a good idea, because otherwise (as you observed) the main thread will exit before the others have finished, but you are overwriting the @thread variable on each iteration through the loop.

The usual idiom for this is something like:

threads = array.map { Thread.new {...} }
threads.each {|th| th.join}
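
Spelled out, that idiom might look like the sketch below. It assumes the same `array` and `holder` as in the original script. The short `sleep(0.1)` is only a stand-in for the real work, and the `Mutex` is an added precaution (not part of the original) so that appends to `holder` stay safe on Ruby implementations that run threads truly in parallel.

```ruby
array  = (1..10).to_a
holder = []
lock   = Mutex.new # assumption: guards holder; not in the original script

# Keep a reference to every thread instead of overwriting one variable.
threads = array.map do |number|
  Thread.new do
    puts "Starting #{number}..."
    sleep(0.1) # stand-in for the real work
    lock.synchronize { holder << number }
  end
end

# Wait for all of them, not just the last one created.
threads.each { |th| th.join }

puts holder.sort.inspect
```

Joining every thread means the main script cannot exit while any of them is still running, whichever one happens to finish last.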

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

In plain Ruby, you might want to rewrite this to a more
functional style:

holder =
array.collect do |number|
   Thread.new do
     puts "Starting #{number}...\n"
     sleep(5)
     number
   end
end.collect do |thread|
   thread.value
end

And, using ThreadLimiter [1,2], you can reduce it to:

require "threadlimiter"

holder =
array.threaded_collect do |number|
   puts "Starting #{number}...\n"
   sleep(5)
   number
end
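
If the point of `each_slice(3)` in the original script was to run at most three jobs at a time, a plain-Ruby sketch without any extra library could wait for each group before starting the next. This is only an illustration under that assumption, not a description of how ThreadLimiter works internally; the short `sleep(0.1)` again stands in for the real work.

```ruby
array = (1..10).to_a

# Start one group of (at most) three threads, wait for it, then move on.
holder = array.each_slice(3).flat_map do |group|
  group.map do |number|
    Thread.new do
      sleep(0.1) # stand-in for the real work
      number     # the block's result becomes the thread's value
    end
  end.map { |thread| thread.value } # value waits for the thread to finish
end

puts holder.inspect
```

Because the values are read in the order the threads were created, `holder` keeps the input order even when threads finish out of order.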

gegroet,
Erik V. - http://www.erikveen.dds.nl/

[1] http://www.erikveen.dds.nl/threadlimiter/doc/index.html
[2] http://rubyforge.org/projects/threadlimiter/

Joel VanderWerf wrote:

Michael Boutros wrote:

      puts "Starting #{number}...\n"
      sleep(5)
      holder << number
    end
  end
end

@thread.join

#join is definitely a good idea, because otherwise (as you observed) the
main thread will exit before the others have finished, but you are
overwriting the @thread variable on each iteration through the loop.

The usual idiom for this is something like:

threads = array.map { Thread.new {...} }
threads.each {|th| th.join}

Joel,

Initially I did that on purpose, because I thought I would only need to
"join" one thread to wait for all of them. Then I realized that some
threads might finish before others, so I altered the code to use the
method that you described.
