Is flock broken on Solaris?

I’ve tested this program on Linux boxes with 1 and 2 processors, but on
a quad-CPU SunOS 5.7 machine there are apparently problems with flock.

$ ruby -v
1.7.3 (2002-10-30) [sparc-solaris2.7]

The program forks n child processes which attempt to read and write a
marshalled object in /tmp/foo.

It’s fine on SunOS with n=1. It’s fine on Linux (single-CPU and dual-CPU
SMP) with n=1, 2, 20. But it fails with n=2 on SunOS 5.7.

The only explanation I can think of is that flock is not preventing the
two processes from simultaneously accessing the file. Hope I’m wrong…


====================

class TestObject
  attr_accessor :x
  def initialize
    @x = 0
    @random_stuff = (0...rand(10)).map {|i| i}
      # so the object has variable size
  end
end

$path = '/tmp/foo'
process_count = 2
rep_count = 1000

File.open($path, "w") do |f|
  Marshal.dump(TestObject.new, f)
end

(0...process_count).map do
  fork do
    rep_count.times do
      File.open($path, "r+") do |f|
        begin
          f.flock(File::LOCK_EX)

          str = f.read
          begin
            tester = Marshal.load(str)
          rescue ArgumentError => e
            puts e
            p str
            exit!
          end

          case 2  # 1 or 2 both cause the problem
          when 1
            File.read($path)

            new_tester = TestObject.new
              # will probably have different size

            tester = new_tester
              # commenting out this line fixes problem

            f.rewind
            Marshal.dump(tester, f)
            f.truncate(f.pos)

          when 2
            File.read($path)

            f.rewind
            f.truncate(0)
              # This ^^^ line (*) seems to cause the problem
            Marshal.dump(tester, f)
            #f.truncate(f.pos)
              # doing this ^^^ instead of (*) fixes it

          end

        ensure
          f.flush
          f.flock(File::LOCK_UN)
        end
      end
    end

  end
end

monitor_thread = Thread.new do
  i = 0
  loop do
    puts "Clock: #{i} seconds"
    sleep 1
    i += 1
  end
end

(0...process_count).each do
  Process.wait
end

Joel VanderWerf wrote:

        File.read($path)

I should have pointed out that the File.read call (within the lock
context) is necessary to the problem. I wonder if that gives up the
lock? Because doing the following immediately after the read seems to
fix the problem:

         f.flock(File::LOCK_EX)

So this suggests that file locks are nestable on Linux but not on
Solaris. The man pages say that Solaris flock is implemented in terms of
fcntl, so maybe that’s where the semantic difference comes from…
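
One way to check the "File.read gives up the lock" guess directly is a small probe, sketched below. It is only a sketch: the helper name lock_survives_second_open? and the probe file under Dir.tmpdir are made up for illustration. It relies on POSIX fcntl locks being dropped when the owning process closes any descriptor for the file, which is what File.read does internally. That this is what actually happens on SunOS 5.7 is an assumption the probe would confirm or refute.

```ruby
require 'tmpdir'

# Hypothetical helper: returns true if an exclusive flock taken on one
# handle is still held after the same process opens, reads, and closes
# a second handle on the same file.
def lock_survives_second_open?(path)
  File.open(path, "w") { |f| f.write("x") }

  File.open(path, "r+") do |f|
    f.flock(File::LOCK_EX)
    File.read(path)  # opens and closes a second descriptor

    # A child process tries to take the lock without blocking.
    pid = fork do
      File.open(path, "r+") do |g|
        got_it = g.flock(File::LOCK_EX | File::LOCK_NB)
        # flock returns false if it would block, 0 on success.
        # Exit status 1 means the child got the lock, so ours was lost.
        exit!(got_it ? 1 : 0)
      end
    end
    _, status = Process.wait2(pid)
    status.exitstatus == 0
  end
end

probe_path = File.join(Dir.tmpdir, "flock_probe_#{Process.pid}")
puts "lock survives second open/close: #{lock_survives_second_open?(probe_path)}"
File.delete(probe_path)
```

If this prints false on the Solaris box and true on Linux, the difference is in what closing a second descriptor does to the lock, rather than in nesting as such.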

I wonder if there is a way ruby can insulate us from this difference?
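
Short of a fix inside ruby, one possible workaround is to touch the file only through the handle that holds the lock, so that no second descriptor is ever opened and closed while the lock is held. A minimal sketch follows. The helper name locked_update and the counter file are hypothetical, and it assumes the problem really does come from the extra File.read, which has not been verified here on Solaris.

```ruby
require 'tmpdir'

# Hypothetical helper: read-modify-write a marshalled object under an
# exclusive lock, using only the locked handle for all file access.
def locked_update(path)
  File.open(path, "r+") do |f|
    f.flock(File::LOCK_EX)
    begin
      obj = Marshal.load(f.read)  # read via f, not File.read(path)
      obj = yield(obj)
      f.rewind
      Marshal.dump(obj, f)
      f.flush
      f.truncate(f.pos)           # drop leftover bytes from a longer old dump
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

counter_path = File.join(Dir.tmpdir, "flock_counter_#{Process.pid}")
File.open(counter_path, "wb") { |f| Marshal.dump(0, f) }

# Two children each increment the shared counter 100 times.
pids = (0...2).map do
  fork do
    100.times { locked_update(counter_path) { |n| n + 1 } }
  end
end
pids.each { |pid| Process.wait(pid) }

final_count = File.open(counter_path, "rb") { |f| Marshal.load(f) }
puts final_count
File.delete(counter_path)
```

Truncating to f.pos after the dump, rather than to 0 before it, also means the file is never empty, which matches the observation in the test program that f.truncate(f.pos) avoids the failure.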