I have several forked processes simultaneously accessing the DATA
area with the following function, and it all goes horribly wrong.
def extract(dest,name,mode)
  User.asroot {
    start=0
    fn = path(dest,name)
    File.open(fn,File::CREAT|File::WRONLY,mode) { |file|
      DATA.rewind
      DATA.each { |line|
        if start==0
          start=1 if line =~ %r{^=begin #{name} -}
        else
          break if line =~ %r{^=end -}
          yield line if block_given?
          file.puts line
        end
      }
    }
  }
end
Each process gets partial or corrupt data. Now it obviously has
something to do with the DATA object being shared between processes
somehow (there is presumably a file handle in there somewhere), but how
can I fix it?
Andrew Walrond
Yes I know, so the first thing I tried was
def extract(dest,name,mode)
  data = DATA.dup
  User.asroot {
    start=0
    fn = path(dest,name)
    File.open(fn,File::CREAT|File::WRONLY,mode) { |file|
      data.rewind
      data.each { |line|
        if start==0
          start=1 if line =~ %r{^=begin #{name} -*}
        else
          break if line =~ %r{^=end -*}
          yield line if block_given?
          file.puts line
        end
      }
    }
  }
end
Which didn’t work.
Further investigation proved to be confusing. Can anyone explain what’s
going on?
(I’ve appended the output from each line to make it easier)
#!/bin/ruby -w
#Hello cat
#Hello dog
#hello canary
puts DATA.gets #-> one Ok, as expected.
puts DATA.gets #-> two Ok, as expected.
a=DATA.dup
puts a.gets #-> nil Huh? I was expecting ‘three’
a.rewind # Try rewinding…
puts a.gets #-> #!/bin/ruby -w Ok, I can live with that
b=DATA.dup
puts b.gets #-> nil Hmmm. Ok, same as before
b.rewind # Rewind…
puts b.gets #-> #!/bin/ruby -w As expected
# Mix it up a bit…
puts a.gets #-> #Hello cat Ok
puts a.gets #-> #Hello dog Yep
puts b.gets #-> #Hello cat Ok
puts a.gets #-> #Hello canary Great!
# Ok, let’s simplify it a bit
c=DATA.dup
c.rewind
d=DATA.dup
d.rewind
puts c.gets #-> #!/bin/ruby -w As expected
puts d.gets #-> nil What the hell is going on here?
# Try again, reordering stuff a bit…
c=DATA.dup
d=DATA.dup
c.rewind
puts c.gets #-> #!/bin/ruby -w As expected
d.rewind
puts d.gets #-> #!/bin/ruby -w It works! But why???
__END__
one
two
three
nobu.nokada@softhome.net wrote:
Shared file descriptors share the position even between
processes. You can duplicate DATA.
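To see the shared position in isolation, here is a minimal sketch. It assumes a POSIX system where fork is available; the temporary file and the name shared_offset_demo are made up for illustration and do not come from the thread. It uses sysread so that Ruby's own read buffering does not blur the picture.

```ruby
require 'tempfile'

# Hypothetical demo: returns [what the child read, what the parent read next]
# when both use one handle that was opened before the fork.
def shared_offset_demo
  Tempfile.create('shared') do |tmp|
    tmp.write("one\ntwo\nthree\n")
    tmp.flush

    File.open(tmp.path) do |io|
      rd, wr = IO.pipe
      pid = fork do
        rd.close
        wr.write(io.sysread(4))   # child consumes 4 bytes via the inherited handle
        wr.close
        exit!(0)                  # leave at once, skipping the parent's cleanup code
      end
      wr.close
      child_saw = rd.read         # whatever the child reported through the pipe
      rd.close
      Process.wait(pid)
      parent_saw = io.sysread(4)  # parent continues where the child stopped
      [child_saw, parent_saw]
    end
  end
end

CHILD_SAW, PARENT_SAW = shared_offset_demo
puts "child read  #{CHILD_SAW.inspect}"
puts "parent read #{PARENT_SAW.inspect}"
```

The parent never read the first line itself, yet its next read starts at the second one, because the handle inherited across fork refers to the same open file and therefore the same offset.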
Hi,
Further investigation proved to be confusing. Can anyone explain what’s
going on?
(I’ve appended the output from each line to make it easier)
It looks like a position mismatch between the stdio buffer and the IO descriptor.
Although this may be a bug, try calling #seek before #dup.
#!/bin/ruby -w
#Hello cat
#Hello dog
#hello canary
puts DATA.gets #-> one Ok, as expected.
puts DATA.gets #-> two Ok, as expected.
DATA.seek(0, IO::SEEK_CUR)
a=DATA.dup
puts a.gets #-> nil Huh? I was expecting ‘three’
a.rewind # Try rewinding…
puts a.gets #-> #!/bin/ruby -w Ok, I can live with that
DATA.seek(0, IO::SEEK_CUR)
b=DATA.dup
puts b.gets #-> nil Hmmm. Ok, same as before
b.rewind # Rewind…
puts b.gets #-> #!/bin/ruby -w As expected
But it doesn’t work for the case below, so more investigation is
needed.
At Thu, 12 Jun 2003 17:52:02 +0900, Andrew Walrond wrote:
# Ok, let’s simplify it a bit
c=DATA.dup
c.rewind
d=DATA.dup
d.rewind
puts c.gets #-> #!/bin/ruby -w As expected
puts d.gets #-> nil What the hell is going on here?
--
Nobu Nakada
OK;
#!/bin/ruby -w
#Hello cat
#Hello dog
#hello canary
puts DATA.gets #-> one Ok, as expected.
puts DATA.gets #-> two Ok, as expected.
DATA.seek(0, IO::SEEK_CUR)
a=DATA.dup
puts a.gets #-> three Yes!
a.rewind # Try rewinding…
puts a.gets #-> #!/bin/ruby -w Yes!
DATA.seek(0, IO::SEEK_CUR)
b=DATA.dup
puts b.gets #-> three Yes!
b.rewind # Rewind…
puts b.gets #-> #!/bin/ruby -w As expected
So, as you suggested, the first part works fine with the sync.
I’ll leave it with you then.
Andrew Walrond
nobu.nokada@softhome.net wrote:
Hi,
It looks like a position mismatch between the stdio buffer and the IO descriptor.
Although this may be a bug, try calling #seek before #dup.
Hi,
So, as you suggested, the first part works fine with the sync.
I’ll leave it with you then.
That behaviour is normal. As I wrote in [ruby-talk:73295],
Shared file descriptors share the position even between
processes.
Therefore, when the first fd reaches EOF, the second fd points at EOF
too. This is the UNIX I/O model.
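The c/d puzzle earlier in the thread follows from the same rule once read buffering is taken into account. A rough sketch, using an ordinary temporary file instead of DATA (the file, its contents, and the name dup_vs_reopen are assumptions for illustration):

```ruby
require 'tempfile'

# Hypothetical demo: compares a dup'ed handle with an independently opened one.
def dup_vs_reopen
  Tempfile.create('dupdemo') do |tmp|
    tmp.write("one\ntwo\nthree\n")
    tmp.flush

    a = File.open(tmp.path)
    b = a.dup                 # shares a's open file, hence a's offset
    c = File.open(tmp.path)   # separate open, hence a separate offset

    first_a = a.gets          # buffered read; on a tiny file it consumes everything
    first_b = b.gets          # the shared offset has already moved past the data
    first_c = c.gets          # not affected by anything a or b did

    [a, b, c].each(&:close)
    [first_a, first_b, first_c]
  end
end

FIRST_A, FIRST_B, FIRST_C = dup_vs_reopen
p FIRST_A, FIRST_B, FIRST_C
```

On a file this small, the first gets on a typically pulls the entire file into a's buffer, which moves the shared offset to EOF, so b has nothing left to read. A handle obtained from a separate File.open keeps its own position.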
At the least, however, the seek before dup should probably be done
automatically.
(Replying to Andrew Walrond’s message of Fri, 13 Jun 2003 20:30:48 +0900.)
Index: io.c
RCS file: /cvs/ruby/src/ruby/io.c,v
retrieving revision 1.213
diff -u -2 -p -r1.213 io.c
--- io.c 7 Jun 2003 15:33:40 -0000 1.213
+++ io.c 22 Jun 2003 01:07:46 -0000
@@ -2491,8 +2491,12 @@ rb_io_init_copy(dest, io)
     if (orig->f2) {
         io_fflush(orig->f2, orig);
+        fseeko(orig->f, 0L, SEEK_CUR);
     }
     else if (orig->mode & FMODE_WRITABLE) {
         io_fflush(orig->f, orig);
     }
+    else {
+        fseeko(orig->f, 0L, SEEK_CUR);
+    }

     /* copy OpenFile structure */
--
Nobu Nakada
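One way to sidestep the shared DATA handle altogether is to let each process open the script file itself and skip ahead to the data section. The helper below is hypothetical (each_data_line is not part of Ruby or of the code in this thread), and it assumes the marker line is exactly __END__. The temporary script at the bottom exists only to exercise it.

```ruby
require 'tempfile'

# Hypothetical helper: yields every line after the __END__ marker of the
# script at `path`, using a handle private to the calling process.
def each_data_line(path)
  return enum_for(:each_data_line, path) unless block_given?

  File.open(path) do |io|
    in_data = false
    io.each_line do |line|
      if in_data
        yield line
      elsif line.chomp == '__END__'
        in_data = true          # everything after this line is data
      end
    end
  end
end

# Exercise the helper on a throwaway script file.
DEMO_LINES = Tempfile.create(['script', '.rb']) do |tmp|
  tmp.write("puts 'hi'\n__END__\none\ntwo\n")
  tmp.flush
  each_data_line(tmp.path).to_a
end
p DEMO_LINES
```

Inside a function like extract, a call such as each_data_line($0) { |line| ... } could stand in for DATA.rewind followed by DATA.each, assuming $0 names the main script. Because every call opens the file afresh, forked processes no longer compete for one offset.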