Ruby threads and the system call

Hello !

  I've recently had a fair bit of trouble with Ruby threads (on a
Linux box). I wrote three different programs (attached) that write
numbers from within threads:

  * one with a standard puts
  * one with a system "echo ..."
  * one with a fork do system "echo ..." end

  The first one behaves as expected (although the numbers are perfectly
ordered, which looks suspicious). In the second one, ruby never manages
to make a second system call in a thread (and finishes before the
subprograms are terminated). The third behaves a little better, but
crashes after, say, 3 to 4 system calls in a thread...

  Is this true on other boxes ? Is it a flaw of my understanding of
threads, or a limitation inherent to the way threads are coded in Ruby ?
If that is so, it's really annoying - no way to delegate heavy tasks to
other well-written programs...

  Thoughts ?

  Vince

test_threads_no_system.rb (121 Bytes)

test_threads_system.rb (129 Bytes)

test_threads_system_fork.rb (155 Bytes)

Vincent Fourmond wrote:

/ ...

  Thoughts ?

Try this code:

----------------------------------------

#!/usr/bin/ruby -w

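# Start from an empty log file (File.exists? was removed in Ruby 3.2)
File.delete("temp.txt") if File.exist?("temp.txt")

10.times { |i|
   x = Thread.new(i) { |n|
      10.times { |j|
         fork {
            system "echo #{n}, #{j} >> temp.txt"
         }
         Process.wait   # reap the child so it doesn't become a zombie
      }
   }
   x.join               # finish this thread before starting the next one
}

system "wc -l temp.txt" # should print "100 temp.txt"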

----------------------------------------

According to the documentation for "fork", the parent should call
"Process::wait" to avoid leaving zombie processes behind. "x.join" makes
each thread finish before the next one starts, so the log file is never
written by several processes at once in a way that could corrupt it.
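
For comparison, here is a rough sketch of a variant in which all threads are
started first and joined only afterwards, so they can actually overlap. The
method name run_in_threads and its parameters are made up for illustration.
Each thread waits for its own child by pid instead of calling a bare
Process.wait, which could otherwise reap a child belonging to another thread.
This assumes a platform where fork is available and an "echo" program is on
the PATH.

```ruby
# Hypothetical helper: run `calls` echo commands in each of `threads` threads.
# Returns one array of exit statuses per thread.
def run_in_threads(threads: 10, calls: 10)
  workers = Array.new(threads) do |i|
    Thread.new(i) do |n|
      Array.new(calls) do |j|
        # The child replaces itself with echo; no shell is involved.
        pid = fork { exec("echo", "#{n}, #{j}") }
        # Wait for this particular child only, and keep its status.
        _, status = Process.wait2(pid)
        status.exitstatus
      end
    end
  end
  # Thread#value joins the thread and returns the result of its block.
  workers.map(&:value)
end

statuses = run_in_threads
puts statuses.flatten.count(0)   # number of echo calls that exited with 0
```

The order of the echoed lines is not fixed here, since nothing serializes
the threads any more.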

If you want to test simultaneous independent threads, you could write to a
separate file per thread, then combine the files after all threads have
completed as an accuracy check. But I don't see the problem as anything
other than multiple simultaneous writes to a single file (or to the
console) creating I/O inconsistencies in an otherwise consistent program.
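
The per-thread-file idea might look roughly like the sketch below. The helper
names (write_per_thread_files, combined_lines) and the part_N.txt file names
are invented for the example, and it assumes an "echo" program on the PATH.
Each thread appends only to its own file, and the line count is checked after
every thread has been joined.

```ruby
require "tmpdir"

# Hypothetical helper: every thread appends to its own file, so no two
# writers ever share a file.
def write_per_thread_files(dir, threads: 10, lines: 10)
  workers = Array.new(threads) do |i|
    Thread.new(i) do |n|
      path = File.join(dir, "part_#{n}.txt")
      lines.times do |j|
        # Multi-argument system skips the shell; the child's stdout is
        # redirected to the per-thread file, opened in append mode.
        system("echo", "#{n}, #{j}", out: [path, "a"])
      end
    end
  end
  workers.each(&:join)   # join only after all threads have been started
end

# Hypothetical helper: read the parts back as one list of lines.
def combined_lines(dir)
  Dir.glob(File.join(dir, "part_*.txt")).sort.flat_map { |p| File.readlines(p) }
end

Dir.mktmpdir do |dir|
  write_per_thread_files(dir)
  puts combined_lines(dir).size   # accuracy check: threads * lines expected
end
```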

The third behaves a little better, but
crashes after, say, 3 to 4 system calls in a thread...

As to "crash", what error messages did you see? How do you define "crash"?

This may be system-related. You might not have the resources required for
this program, if all threads and forks run at once. Just a guess.

--
Paul Lutus
http://www.arachnoid.com

Both fork and the system call have the effect of creating a subprocess which
then interacts with your main process through open file descriptors and
through signals. Both interact badly with threads. I suggest you consider a
non-threaded design: fork off your subprocesses, and use Process.waitpid2 to
tell which subprocess has returned and with what status. If you need to
capture data generated by print statements in the subprocesses, then you can
play painful games like redirecting the standard file descriptors to other
file numbers, or you can simply make sure you only have one subprocess in
flight at a time.
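
A minimal sketch of that non-threaded design, assuming a platform with fork
and the usual "true" and "false" programs on the PATH. The run_jobs helper
and the job names are hypothetical. Process.waitpid2(-1) waits for any child
and returns its pid together with a Process::Status, which is how the
returns are told apart.

```ruby
# Hypothetical helper: start one child per job, then reap them one by one.
# `jobs` maps a job name to an argv array; returns name => exit status.
def run_jobs(jobs)
  pids = {}
  jobs.each do |name, command|
    pid = fork { exec(*command) }   # the child replaces itself with the command
    pids[pid] = name
  end

  results = {}
  until pids.empty?
    # -1 means "any child"; the result is [pid, Process::Status].
    pid, status = Process.waitpid2(-1)
    name = pids.delete(pid)
    results[name] = status.exitstatus if name
  end
  results
end

results = run_jobs("ok" => ["true"], "failing" => ["false"])
results.each { |name, code| puts "#{name} exited with #{code}" }
```

All children are in flight at once here. The order in which they are reaped
depends on which one finishes first.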

On 8/27/06, Vincent Fourmond <vincent.fourmond@9online.fr> wrote:

  Hello !

  I've recently had quite a fair bit of trouble with Ruby threads (on a
Linux box). I wrote three different programs (attached) that write
numbers from within threads:

  * one with a standard puts
  * one with a system "echo ..."
  * one with a fork do system "echo ..." end

Hello !

  Many thanks, I had completely forgotten to call join for the
threads... By the way, the version with fork is around 4 times faster on
my box:

ruby test_threads_system.rb > /dev/null  0.04s user 0.10s system 21% cpu 0.639 total
ruby test_threads_system_fork.rb > /dev/null  0.00s user 0.04s system 25% cpu 0.171 total

  Cheers and thanks !!

  Vince