IO#clone and 1.6 -> 1.8 question

Hi –

Just tinkering with IO#clone, in connection with some irc
chatting, and found the following behavior which puzzles
me. Here’s the test program:

fh = File.new(“nums”) # ‘nums’ is 1…20, one integer per line
puts fh.gets

fd = fh.clone
puts fd.gets

puts fh.gets

and here’s output from 1.6.8 and 1.8.0:

ruby 1.6.8 (2002-12-16) [i686-linux]
1
nil
2

ruby 1.8.0 (2003-08-04) [i686-linux]
1
2
nil

Actually I don’t understand either of these; I thought that both runs
would produce 1\n2\n3\n.

Can anyone shed light on these examples?

David

···


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

Hi,

Just tinkering with IO#clone, in connection with some irc
chatting, and found the following behavior which puzzles
me. Here’s the test program:

It’s combination of the two:

  • file position is shared among IO clones. It’s defined stdio
    behavior.

  • IO are buffered so that actual file pointer
    may advance more than you expected (i.e. entire “num” file was
    swallowed this case) by a single “gets”.

    					matz.
    
···

In message “IO#clone and 1.6 → 1.8 question” on 03/09/09, dblack@superlink.net dblack@superlink.net writes:

Hi –

Hi,

Just tinkering with IO#clone, in connection with some irc
chatting, and found the following behavior which puzzles
me. Here’s the test program:

It’s combination of the two:

  • file position is shared among IO clones. It’s defined stdio
    behavior.

That’s what I was expecting, but:

fh = File.new(“nums”)
fd = fh.clone
fh.gets
puts fh.pos
puts fd.pos

=>
2
0

  • IO are buffered so that actual file pointer
    may advance more than you expected (i.e. entire “num” file was
    swallowed this case) by a single “gets”.

I’m really puzzled by that. If that’s the case, how is it possible to
iterate through the lines of a file? Doesn’t it have to be guaranteed
that gets will return the next line?

Also, the thing I’m seeing only seems to happen when a file handle is
cloned. If I just do:

fh = File.new(“nums”)
20.times { puts fh.gets }

I always get 20 lines of output. But when I clone fh, I start to get
the behavior where one fh advances to EOF and the other doesn’t. I’m
afraid I’m still not seeing why cloning a file handle would or should
cause any of this to happen.

David

···

On Tue, 9 Sep 2003, Yukihiro Matsumoto wrote:

In message “IO#clone and 1.6 → 1.8 question” > on 03/09/09, dblack@superlink.net dblack@superlink.net writes:


David Alan Black
home: dblack@superlink.net
work: blackdav@shu.edu
Web: http://pirate.shu.edu/~blackdav

I always get 20 lines of output. But when I clone fh, I start to get
the behavior where one fh advances to EOF and the other doesn't. I'm
afraid I'm still not seeing why cloning a file handle would or should
cause any of this to happen.

With your original example

  fh = File.new("nums") # 'nums' is 1..20, one integer per line
  puts fh.gets

  fd = fh.clone
  puts fd.gets

  puts fh.gets

it do this (well with some more tests :-)))

svg% cat nums
1
2
3
4
5
6
7
8
9
svg%

svg% cat a.c
#include <stdio.h>

main()
{
    FILE *f1, *f2;
    char line[1024];

    f1 = fopen("nums", "r");
    line[0] = line[1] = line[2] = 0;
    fread(line, 1, 2, f1);
    printf("%s", line);
    fseeko(f1, 0L, SEEK_CUR);
    f2 = fdopen(dup(fileno(f1)), "r");
    line[0] = line[1] = 0;
    fread(line, 1, 2, f2);
    printf("%s", line);
    line[0] = line[1] = 0;
    if (fread(line, 1, 2, f1) == 0) {
        printf("EOF\n");
    }
    else {
        printf("%s", line);
    }
}
svg%

svg% a.out
1
2
EOF
svg%

Guy Decoux

Hi,

It’s combination of the two:

  • file position is shared among IO clones. It’s defined stdio
    behavior.

That’s what I was expecting, but:

fh = File.new(“nums”)
fd = fh.clone
fh.gets
puts fh.pos
puts fd.pos

=>
2
0

Ah, I have to tell you something. There’s two file positions for an
IO. One is a file position for a file descriptor. This is shared
among duped file descriptors. The other is a file descriptor for a
stdio (i.e. FILE*), this is not shared. For example, say we name
the former position as rpos (stands for real position), and the latter
as bpos (stands for buffered position).

When you have “1\n2\n3\n” in a file “num”, and you get an IO by
opening it.

fh = File.new(“nums”)

Its rpos and bpos are 0. You clone fh.

fd = fh.clone

Naturally, clone’s rpos and bpos are 0 too. Then you call gets on the
original IO.

fh.gets

It returns “1\n”, bpos is right after the first newline, but rpos is
at the end of file, since whole file is read in the fh’s buffer.

puts fd.pos

“pos” method returns IO’s bpos, so that it returns 0. When you try to
read from fd, since its internal buffer is empty, it tries to read
from the file descriptor. Its rpos is at the end of file (remember
rpos is shared). Thus “gets” returns nil. Weird? I agree, but it’s
how stdio defined to work.

So what this means? It means mixing buffered read with shared file
descriptor cause real confusion, just like the one you face. Do not
use buffered read with IO clones. Use sysread if you want to work
with IO clones.

						matz.
···

In message “Re: IO#clone and 1.6 → 1.8 question” on 03/09/09, dblack@superlink.net dblack@superlink.net writes:

Also, the thing I'm seeing only seems to happen when a file handle is
cloned. If I just do:

Well, if you have access to a Solaris system read carefully stdio(3s), you
have

[...]
     Handles can be created or destroyed by user action without
     affecting the underlying open file description. Some of the
     ways to create them include fcntl(2), dup(2), fdopen(3S),
[...]
     If two or more handles are used, and any one of them is a
     stream, their actions shall be coordinated as described
     below. If this is not done, the result is undefined.
[...]
        10.
           For the second handle: if any previous active handle
           has called a function that explicitly changed the file
           offset, except as required above for the first handle,
           the application shall perform an lseek() or an
           fseek(2) (as appropriate to the type of the handle) to
           an appropriate location.
[...]

Guy Decoux