IO#lineno= doesn't work the way I expected

I'm working on something that operates on each line of a file
individually. So far, the relevant part looks something like this:

    in_fh = File.open(infile, 'r')
    out_fh = File.open(outfile, 'w')

    in_fh.each_line do |l|
      out_fh << l
    end

Obviously not the actual working code (since it doesn't do anything other
than copy the file), but it's the simplified version I'm using to test a
feature, anyway.

This works exactly as expected. I get a duplicate of the input file,
with the name assigned to the outfile variable.

So . . . I want to be able to skip a number of lines at the beginning of
the input file. It seemed like IO#lineno= would be the obvious solution:

    in_fh = File.open(infile, 'r')
    out_fh = File.open(outfile, 'w')

    in_fh.lineno = 1

    in_fh.each_line do |l|
      out_fh << l
    end

Unfortunately, that doesn't give me what I expected at all. In fact,
what I end up with is an empty file. Am I misunderstanding what
IO#lineno= does? Is there something about the way IO#each_line works
that is interacting badly with an incremented line number? What do I
need to do differently to make this work?

···

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
Quoth Dennis Miller: "Bill Gates is a monocle and a Persian Cat away
from being the villain in a James Bond movie."

Chad Perrin wrote:

[...]

Unfortunately, that doesn't give me what I expected at all. In fact,
what I end up with is an empty file. Am I misunderstanding what
IO#lineno= does? Is there something about the way IO#each_line works
that is interacting badly with an incremented line number? What do I
need to do differently to make this work?

The way I read the ri doc for lineno=, it appears that all it does is determine the value that lineno returns the next time you call it. That is, it doesn't move the read position in the file.

I think what you're looking for is IO#seek, but notice that seek doesn't operate on lines, only on byte offsets.
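
A minimal sketch of the behaviour described above, for anyone who wants to check it against a file with known contents. The helper name `lineno_demo` and the three-line test file are invented for illustration, and `Tempfile.create` assumes a reasonably modern Ruby:

```ruby
require 'tempfile'

# Hypothetical helper: records what lineno= does and does not change.
def lineno_demo
  Tempfile.create('lineno_demo') do |f|
    f.write("This is line one\nThis is line two\nThis is line three\n")
    f.rewind

    first = f.gets        # reads the first line; the counter becomes 1
    pos_before = f.pos
    f.lineno = 1000       # overwrites the counter only
    pos_after = f.pos     # the byte offset is untouched
    second = f.gets       # still the physical second line of the file

    { first: first, second: second,
      moved: pos_before != pos_after, counter: f.lineno }
  end
end

p lineno_demo
```

With the file contents stated up front, the second `gets` can only be returning the physical second line, so the read position evidently did not jump to line 1000.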

···

--
RMagick: http://rmagick.rubyforge.org/

The way I read the ri doc for lineno=, it appears that all it does is
determine the value that lineno returns the next time you call it. That
is, it doesn't move the read position in the file.

I think you're confusing lineno= with lineno (which are two separate
methods). IO#lineno does this:

    ios.lineno
    => 0

IO#lineno= does this:

    ios.lineno = 3
    => 3

The class is specced here, with the screen scrolled to where IO#lineno
and IO#lineno= are listed:

    class IO - RDoc Documentation

I think what you're looking for is IO#seek, but notice that seek
doesn't operate on lines, only on byte offsets.

That would make IO#seek not what I want, unfortunately. Line lengths are
not known in advance.

···

On Sun, Nov 16, 2008 at 02:27:45AM +0900, Tim Hunter wrote:

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
Quoth Larry Wall: "You can never entirely stop being what you once were.
That's why it's important to be the right person today, and not put it
off till tomorrow."

The description of the method is somewhat ambiguous if you ask me.

My view of the docs is in line with what Tim was describing.

------------------------------------------------------------- IO#lineno=
     ios.lineno = integer => integer

···

On Sat, Nov 15, 2008 at 2:08 PM, Chad Perrin <perrin@apotheon.com> wrote:

On Sun, Nov 16, 2008 at 02:27:45AM +0900, Tim Hunter wrote:

The way I read the ri doc for lineno=, it appears that all it does is
determine the value that lineno returns the next time you call it. That
is, it doesn't move the read position in the file.

I think you're confusing lineno= with lineno (which are two separate
methods). IO#lineno does this:

   ios.lineno
   => 0

IO#lineno= does this:

   ios.lineno = 3
   => 3

The class is specced here, with the screen scrolled to where IO#lineno
and IO#lineno= are listed:

   class IO - RDoc Documentation

------------------------------------------------------------------------
     Manually sets the current line number to the given value. +$.+ is
     updated only on the next read.

        f = File.new("testfile")
        f.gets #=> "This is line one\n"
        $. #=> 1
        f.lineno = 1000
        f.lineno #=> 1000
        $. # lineno of last read #=> 1
        f.gets #=> "This is line two\n"
        $. # lineno of last read #=> 1001

I would think if it had the behavior you described, the second time
f.gets is called, we would see: "This is line one thousand and one\n"
not "This is line two\n"

Michael Guterl

Or maybe even "This is line one thousand\n". I'm not sure...

Also, here are the specs for IO#lineno and IO#lineno= from RubySpec:
http://github.com/rubyspec/rubyspec/tree/master/1.8/core/io/lineno_spec.rb

HTH,
Michael Guterl

···

The way I read the ri doc for lineno=, it appears that all it does is
determine the value that lineno returns the next time you call it. That
is, it doesn't move the read position in the file.

Exactly.

The class is specced here, with the screen scrolled to where IO#lineno
and IO#lineno= are listed:

   class IO - RDoc Documentation

The description of the method is somewhat ambiguous if you ask me.

I don't think so.

My view of the docs is in line with what Tim was describing.

Same here.

------------------------------------------------------------- IO#lineno=
     ios.lineno = integer => integer
------------------------------------------------------------------------
     Manually sets the current line number to the given value. +$.+ is
     updated only on the next read.

There is no talk about read position in the file - just about "current line number". Also:

        f = File.new("testfile")
        f.gets #=> "This is line one\n"
        $. #=> 1
        f.lineno = 1000
        f.lineno #=> 1000
        $. # lineno of last read #=> 1
        f.gets #=> "This is line two\n"
        $. # lineno of last read #=> 1001

The sample makes it very clear that the read position is not affected by lineno= because file reading obviously continues at the position where it was before.

I would think if it had the behavior you described, the second time
f.gets is called, we would see: "This is line one thousand and one\n"
not "This is line two\n"

Right (if by "you" you do not mean Tim, somehow part of the thread is missing in Usenet).

Kind regards

  robert

···

On 15.11.2008 20:51, Michael Guterl wrote:

On Sat, Nov 15, 2008 at 2:08 PM, Chad Perrin <perrin@apotheon.com> wrote:

On Sun, Nov 16, 2008 at 02:27:45AM +0900, Tim Hunter wrote:

Okay, so . . .

  1. What the hell is the point of IO#lineno= if it does the same thing
  as IO#lineno except that it lets you specify a number that doesn't do
  anything?

  2. Is there a way to do what I actually need -- specifically, to
  iterate over an entire file *except* the first line, only reading one
  line at a time into RAM -- without writing a C extension for Ruby?

···

On Sun, Nov 16, 2008 at 04:54:01AM +0900, Michael Guterl wrote:

>
> I would think if it had the behavior you described, the second time
> f.gets is called, we would see: "This is line one thousand and one\n"
> not "This is line two\n"
>
Or maybe even "This is line one thousand\n". I'm not sure...

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
Quoth Paul Graham: "Object-oriented programming offers a sustainable way
to write spaghetti code."

>>The class is specced here, with the screen scrolled to where IO#lineno
>>and IO#lineno= are listed:
>>
>> class IO - RDoc Documentation
>>
>The description of the method is somewhat ambiguous if you ask me.

I don't think so.

The more I look at it, the more ambiguous it appears to be.

>------------------------------------------------------------- IO#lineno=
> ios.lineno = integer => integer
>------------------------------------------------------------------------
> Manually sets the current line number to the given value. +$.+ is
> updated only on the next read.

There is no talk about read position in the file - just about "current
line number". Also:

. . . which, to someone who isn't assuming "line number" is just a
magical number plucked out of the air, makes it sound like it moves the
read position to a line whose ordinal position is that of the specified
line number. In other words, that's how it "sounded" to me.

> f = File.new("testfile")
> f.gets #=> "This is line one\n"
> $. #=> 1
> f.lineno = 1000
> f.lineno #=> 1000
> $. # lineno of last read #=> 1
> f.gets #=> "This is line two\n"
> $. # lineno of last read #=> 1001

The sample makes it very clear that the read position is not affected by
lineno= because file reading obviously continues at the position where
it was before.

It only makes that clear if you assume a lot of things about what's in
the file in question. I can see now, in retrospect, how you came to that
conclusion -- but the fact that the second use of `f.gets` returns "This
is line two\n" doesn't *necessarily* mean that the return value is from
the second line of the file. I read it, initially, as meaning that
whatever line of the file it was, it just happened to say "This is line
two\n" because that made for some convenient text to have in the example.

Since the contents of the file were not made clear in advance, the
assumption that only the second line of the file can possibly say "This
is line two\n" does not clarify anything for the reader except by
accident. It could just mean "This is the second line of output from
this code."

>I would think if it had the behavior you described, the second time
>f.gets is called, we would see: "This is line one thousand and one\n"
>not "This is line two\n"

Right (if by "you" you do not mean Tim, somehow part of the thread is
missing in Usenet).

I don't see why everyone has to assume that the second line of the file
necessarily contains the text "This is line two\n". It's really very
ambiguous. If you want a program that outputs "This is line one\nThis is
line two\n", and for some reason lines 0 and 1000 of the file contain
"This is line one\n" and "This is line two\n" respectively, the alternate
interpretation of the way the method works makes *perfect* sense.

What *doesn't* make any sense to me is the idea that, for some reason,
it's important and common enough an operation to misnumber line numbers
that there has to be a `lineno=` method that counterfeits line numbers.
What the hell is the point of that? Please explain that to me.

···

On Sun, Nov 16, 2008 at 08:36:50PM +0900, Robert Klemme wrote:

On 15.11.2008 20:51, Michael Guterl wrote:
>On Sat, Nov 15, 2008 at 2:08 PM, Chad Perrin <perrin@apotheon.com> wrote:

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
My first programming koan: If a lambda has the ability to access its
context, but there isn't any context to access -- is it still a closure?

Chad Perrin wrote:

  2. Is there a way to do what I actually need -- specifically, to
  iterate over an entire file *except* the first line, only reading one
  line at a time into RAM -- without writing a C extension for Ruby?

Why not iterate over the entire file and just ignore the first line? IO#readline or IO#gets will get you a line at a time.

···

--
RMagick: http://rmagick.rubyforge.org/

Now that I agree with. I have no idea why this method exists. It's just as easy to do:

   f.lineno + 1000

James Edward Gray II

···

On Nov 16, 2008, at 1:51 PM, Chad Perrin wrote:

What *doesn't* make any sense to me is the idea that, for some reason,
it's important and common enough an operation to misnumber line numbers
that there has to be a `lineno=` method that counterfeits line numbers.
What the hell is the point of that? Please explain that to me.

>>The class is specced here, with the screen scrolled to where IO#lineno
>>and IO#lineno= are listed:
>>
>> class IO - RDoc Documentation
>>
>The description of the method is somewhat ambiguous if you ask me.

I don't think so.

The more I look at it, the more ambiguous it appears to be.

That's the usual effect of staring at a sentence for too long. :-) Relax.

>------------------------------------------------------------- IO#lineno=
> ios.lineno = integer => integer
>------------------------------------------------------------------------
> Manually sets the current line number to the given value. +$.+ is
> updated only on the next read.

There is no talk about read position in the file - just about "current
line number". Also:

. . . which, to someone who isn't assuming "line number" is just a
magical number plucked out of the air, makes it sound like it moves the
read position to a line whose ordinal position is that of the specified
line number. In other words, that's how it "sounded" to me.

Yes, but the example makes it pretty clear that this is not the way it is:

> f = File.new("testfile")
> f.gets #=> "This is line one\n"
> $. #=> 1
> f.lineno = 1000
> f.lineno #=> 1000
> $. # lineno of last read #=> 1
> f.gets #=> "This is line two\n"
> $. # lineno of last read #=> 1001

The sample makes it very clear that the read position is not affected by
lineno= because file reading obviously continues at the position where
it was before.

It only makes that clear if you assume a lot of things about what's in
the file in question. I can see now, in retrospect, how you came to that
conclusion -- but the fact that the second use of `f.gets` returns "This
is line two\n" doesn't *necessarily* mean that the return value is from
the second line of the file.

Of course not. But what sense would it make to create a file with a
different content that would return "This is line two\n" when
explaining how lineno= works? The most obvious explanation is that
someone created a file where "This is line two" is actually placed in
the second line to demonstrate the non effect on file position.

I read it, initially, as meaning that
whatever line of the file it was, it just happened to say "This is line
two\n" because that made for some convenient text to have in the example.

Actually I believe the other interpretation is much more
straightforward and reasonable.

Since the contents of the file were not made clear in advance, the
assumption that only the second line of the file can possibly say "This
is line two\n" does not clarify anything for the reader except by
accident. It could just mean "This is the second line of output from
this code."

See above. IMHO applying a bit of common sense will show you
that your reasoning goes a bit astray here - although from a formal
point of view you are right.

>I would think if it had the behavior you described, the second time
>f.gets is called, we would see: "This is line one thousand and one\n"
>not "This is line two\n"

Right (if by "you" you do not mean Tim, somehow part of the thread is
missing in Usenet).

I don't see why everyone has to assume that the second line of the file
necessarily contains the text "This is line two\n". It's really very
ambiguous.

... for you.

If you want a program that outputs "This is line one\nThis is
line two\n", and for some reason lines 0 and 1000 of the file contain
"This is line one\n" and "This is line two\n" respectively, the alternate
interpretation of the way the method works makes *perfect* sense.

Formally speaking yes, with a bit of common sense, no.

What *doesn't* make any sense to me is the idea that, for some reason,
it's important and common enough an operation to misnumber line numbers
that there has to be a `lineno=` method that counterfeits line numbers.
What the hell is the point of that? Please explain that to me.

I do not know this. IO#lineno= can help implementing ARGF although it
is not needed. Maybe ARGF is completely implemented in Ruby and
delegates to C code for the IO handling - including line counting. In
that case it's handy to have this setter so you can offset the line
number of the next opened file.

Kind regards

robert

···

2008/11/16 Chad Perrin <perrin@apotheon.com>:

On Sun, Nov 16, 2008 at 08:36:50PM +0900, Robert Klemme wrote:

On 15.11.2008 20:51, Michael Guterl wrote:
>On Sat, Nov 15, 2008 at 2:08 PM, Chad Perrin <perrin@apotheon.com> wrote:

--
remember.guy do |as, often| as.you_can - without end

How do you propose I "ignore" the first line? That's what I was trying
to do -- by starting the iteration over lines in the file with the second
line. Are you saying I should just have an orphan gets line in the code,
then have a block in which I use gets each line until I run out of file?
I guess that would probably work, but seems kinda . . . ugly. I just
need to figure out an efficient way to calculate the number of lines in
the file (minus one) now so I can use that number to control the number
of iterations.

I really hope you aren't suggesting I have a conditional in every single
iteration, going through all the lines in the file, to test whether it's
the first line so the first line can be ignored.

. . . and why is there a lineno and a lineno= if lineno= doesn't actually
do anything other than prompt you for a useless number? I still don't
understand that.

···

On Sun, Nov 16, 2008 at 05:42:18AM +0900, Tim Hunter wrote:

Chad Perrin wrote:
> 2. Is there a way to do what I actually need -- specifically, to
> iterate over an entire file *except* the first line, only reading one
> line at a time into RAM -- without writing a C extension for Ruby?

Why not iterate over the entire file and just ignore the first line?
IO#readline or IO#gets will get you a line at a time.

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
Baltasar Gracian: "A wise man gets more from his enemies than a fool from
his friends."

>> >------------------------------------------------------------- IO#lineno=
>> > ios.lineno = integer => integer
>> >------------------------------------------------------------------------
>> > Manually sets the current line number to the given value. +$.+ is
>> > updated only on the next read.
>>
>> There is no talk about read position in the file - just about "current
>> line number". Also:
>
> . . . which, to someone who isn't assuming "line number" is just a
> magical number plucked out of the air, makes it sound like it moves the
> read position to a line whose ordinal position is that of the specified
> line number. In other words, that's how it "sounded" to me.

Yes, but the example makes it pretty clear that this is not the way it is:

Only if you didn't read what I just said, and thus still think everyone
in the world made the same assumption you did.

>> > f = File.new("testfile")
>> > f.gets #=> "This is line one\n"
>> > $. #=> 1
>> > f.lineno = 1000
>> > f.lineno #=> 1000
>> > $. # lineno of last read #=> 1
>> > f.gets #=> "This is line two\n"
>> > $. # lineno of last read #=> 1001
>>
>> The sample makes it very clear that the read position is not affected by
>> lineno= because file reading obviously continues at the position where
>> it was before.
>
> It only makes that clear if you assume a lot of things about what's in
> the file in question. I can see now, in retrospect, how you came to that
> conclusion -- but the fact that the second use of `f.gets` returns "This
> is line two\n" doesn't *necessarily* mean that the return value is from
> the second line of the file.

Of course not. But what sense would it make to create a file with a
different content that would return "This is line two\n" when
explaining how lineno= works? The most obvious explanation is that
someone created a file where "This is line two" is actually placed in
the second line to demonstrate the non effect on file position.

If the point of the example was to demonstrate how to get "line two" of
the output you want from line 1000 of the document, it makes *perfect*
sense. The most obvious explanation seems to me to be
the one that supports IO#lineno= actually being a needed method rather
than . . . whatever the hell it actually is.

> I read it, initially, as meaning that
> whatever line of the file it was, it just happened to say "This is line
> two\n" because that made for some convenient text to have in the example.

Actually I believe the other interpretation is much more
straightforward and reasonable.

Clearly, you believe that. I believe reasonable people can come to
different conclusions about what that meant. In other words, I believe
it's ambiguous. The fact you cannot imagine a different interpretation
than what immediately occurred to you is a failure of your imagination,
not mine. It seems to me that the fact that, no matter how many times you
look at it, you still cannot imagine how a different set of perfectly
reasonable starting assumptions makes it mean something different than
you initially thought is a result of confirmation bias rather
than a sign that I'm stupid -- especially since, as I've pointed out, I
can understand where you got your interpretation *and* where I got mine.

> Since the contents of the file were not made clear in advance, the
> assumption that only the second line of the file can possibly say "This
> is line two\n" does not clarify anything for the reader except by
> accident. It could just mean "This is the second line of output from
> this code."

See above. IMHO applying a bit of common sense will show you
that your reasoning goes a bit astray here - although from a formal
point of view you are right.

In my humble opinion, an application of "common sense" yields different
results based on different reasonable starting assumptions, and that you
believe my reasoning has gone a bit astray only because you are incapable
of understanding how someone can reasonably disagree with you about
initial assumptions.

You seem very invested in proving me "wrong" about the ambiguity of the
example. Do you feel insulted somehow because you wrote the example, and
thus interpret what I'm saying as an attack?

>> >I would think if it had the behavior you described, the second time
>> >f.gets is called, we would see: "This is line one thousand and one\n"
>> >not "This is line two\n"
>>
>> Right (if by "you" you do not mean Tim, somehow part of the thread is
>> missing in Usenet).
>
> I don't see why everyone has to assume that the second line of the file
> necessarily contains the text "This is line two\n". It's really very
> ambiguous.

... for you.

Indeed. It's not ambiguous for you because your initial assumption
turned out to be the same as that of the example's author. It's nice how
that works for you.

I'm not the only person who thinks the example is ambiguous as written.
In fact, I'm not even the first person in this thread who said so. I
only said so when I realized how an alternate set of initial assumptions
led to the alternate interpretation to which you subscribe.

> If you want a program that outputs "This is line one\nThis is
> line two\n", and for some reason lines 0 and 1000 of the file contain
> "This is line one\n" and "This is line two\n" respectively, the alternate
> interpretation of the way the method works makes *perfect* sense.

Formally speaking yes, with a bit of common sense, no.

There you go again. Apparently, "common sense" means "agrees with me" in
your little world. Thus, when I disagree with you, I lack "common
sense". I find that perspective somewhat limiting and egocentric, but I
guess it must reassure your self image somehow.

> What *doesn't* make any sense to me is the idea that, for some reason,
> it's important and common enough an operation to misnumber line numbers
> that there has to be a `lineno=` method that counterfeits line numbers.
> What the hell is the point of that? Please explain that to me.

I do not know this. IO#lineno= can help implementing ARGF although it
is not needed. Maybe ARGF is completely implemented in Ruby and
delegates to C code for the IO handling - including line counting. In
that case it's handy to have this setter so you can offset the line
number of the next opened file.

Wouldn't it be even more handy, in the general case, to be able to set
the offset in the file ahead a few hundred or thousand lines of text? It
certainly would have been more handy for me, especially since there does
not appear to be any way at all to do so in Ruby from what I've seen
without actually *reading* those lines from the IO stream, whereas what
you're talking about (implementing ARGF) is a one-time task that doesn't
actually need the existing IO#lineno= method at all.

···

On Mon, Nov 17, 2008 at 10:27:07PM +0900, Robert Klemme wrote:

2008/11/16 Chad Perrin <perrin@apotheon.com>:
> On Sun, Nov 16, 2008 at 08:36:50PM +0900, Robert Klemme wrote:
>> On 15.11.2008 20:51, Michael Guterl wrote:
>> >On Sat, Nov 15, 2008 at 2:08 PM, Chad Perrin <perrin@apotheon.com> wrote:

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
Quoth H. L. Mencken: "Democracy is the theory that the common people
know what they want and deserve to get it good and hard."

hi chad!

Chad Perrin [2008-11-16 01:32]:

Chad Perrin wrote:

2. Is there a way to do what I actually need -- specifically,
to iterate over an entire file *except* the first line, only
reading one line at a time into RAM -- without writing a C
extension for Ruby?

Why not iterate over the entire file and just ignore the first
line? IO#readline or IO#gets will get you a line at a time.

How do you propose I "ignore" the first line? That's what I was
trying to do -- by starting the iteration over lines in the file
with the second line. Are you saying I should just have an
orphan gets line in the code, then have a block in which I use
gets each line until I run out of file? I guess that would
probably work, but seems kinda . . . ugly.

not sure what tim was suggesting, but this will work:

  in_fh = File.open(infile, 'r')
  out_fh = File.open(outfile, 'w')

  # ignore first line
  in_fh.gets

  in_fh.each_line do |l|
    out_fh << l
  end

ok, it's not pretty, but not *that* ugly either, is it? ;-)

cheers
jens

···

On Sun, Nov 16, 2008 at 05:42:18AM +0900, Tim Hunter wrote:

[...]

Chad, let's agree to disagree and leave it at that.

robert

···

2008/11/17 Chad Perrin <perrin@apotheon.com>:

--
remember.guy do |as, often| as.you_can - without end

Chad, in your very long answer to Robert you forgot to quote the most
important part of his message:

Relax

I believe reasonable people can come to
different conclusions about what that meant. In other words, I believe
it's ambiguous.

Could you file a bug report on http://redmine.ruby-lang.org to make
the documentation less ambiguous to you? This way you'd help others,
who draw the same conclusions you did.

Wouldn't it be even more handy, in the general case, to be able to set
the offset in the file ahead a few hundred or thousand lines of text? It
certainly would have been more handy for me, especially since there does
not appear to be any way at all to do so in Ruby from what I've seen
without actually *reading* those lines from the IO stream (...)

Such a method could be handy, indeed, but it would have to do the
same: actually read all those lines from the IO. There's no other way
to do it.
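
A rough sketch of what such a method might look like, just to make the point concrete. The name `skip_lines` is hypothetical (nothing like it exists in core Ruby as far as this thread establishes), and it simply reads and discards lines one at a time:

```ruby
require 'stringio'

# Hypothetical helper: discard up to n lines from any IO-like object.
# Returns the number of lines actually skipped, which is fewer than n
# when the input runs out first.
def skip_lines(io, n)
  skipped = 0
  while skipped < n && io.gets
    skipped += 1
  end
  skipped
end

io = StringIO.new("a\nb\nc\nd\n")
skip_lines(io, 2)
io.each_line { |l| print l }   # continues with the third line
```

Only one line is held in memory at a time, but every skipped byte still has to be read, since line boundaries cannot be known without looking for the newlines.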

Regards,
Pit

···

On Mon, Nov 17, 2008 at 10:27:07PM +0900, Robert Klemme wrote:

2008/11/17 Chad Perrin <perrin@apotheon.com>:

Jens Wille wrote:

hi chad!

Chad Perrin [2008-11-16 01:32]:

Chad Perrin wrote:

2. Is there a way to do what I actually need -- specifically,
to iterate over an entire file *except* the first line, only
reading one line at a time into RAM -- without writing a C
extension for Ruby?

Why not iterate over the entire file and just ignore the first
line? IO#readline or IO#gets will get you a line at a time.

How do you propose I "ignore" the first line? That's what I was
trying to do -- by starting the iteration over lines in the file
with the second line. Are you saying I should just have an
orphan gets line in the code, then have a block in which I use
gets each line until I run out of file? I guess that would
probably work, but seems kinda . . . ugly.

not sure what tim was suggesting, but this will work:

  in_fh = File.open(infile, 'r')
  out_fh = File.open(outfile, 'w')

  # ignore first line
  in_fh.gets

  in_fh.each_line do |l|
    out_fh << l
  end

ok, it's not pretty, but not *that* ugly either, is it? ;-)

cheers
jens

That's almost precisely what I was suggesting. For real code I'd want to handle the case of the file not having any lines, if that can occur. It doesn't look "ugly" to me at all.

···

On Sun, Nov 16, 2008 at 05:42:18AM +0900, Tim Hunter wrote:

--
RMagick: http://rmagick.rubyforge.org/

hi chad!

Hi.

not sure what tim was suggesting, but this will work:

  in_fh = File.open(infile, 'r')
  out_fh = File.open(outfile, 'w')

  # ignore first line
  in_fh.gets

  in_fh.each_line do |l|
    out_fh << l
  end

ok, it's not pretty, but not *that* ugly either, is it? ;-)

Actually, that looks pretty good, all things considered. I didn't
realize the IO#gets would increment the line number of the file so the
each_line iterator would start on the second line. If that's how it
works, I'm home free. Thank you for your help.

I'll go test this now to make sure it works.

···

On Sun, Nov 16, 2008 at 09:51:16AM +0900, Jens Wille wrote:

--
Chad Perrin [ content licensed PDL: http://pdl.apotheon.org ]
O'Rourke's Circumcision Precept: You can take 10 percent off the top of
anything.

It works fine and, to be clear, it has nothing to do with line numbers. gets() reads forward to a newline character, advancing the file pointer (represented by tell()/pos()) normally. each() works the same way, beginning at the current file pointer location. I think this is pretty standard I/O streaming logic in any language.
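
For reference, a self-contained sketch of the approach discussed in this thread, written with the block form of File.open so both handles are closed (and the output flushed) automatically. The helper name `copy_without_first_line` and the sample file contents are made up for the example:

```ruby
require 'tmpdir'

# Hypothetical helper: copy src to dst, dropping the first line.
def copy_without_first_line(src, dst)
  File.open(src, 'r') do |in_fh|
    File.open(dst, 'w') do |out_fh|
      in_fh.gets                            # consume line one (nil if the file is empty)
      in_fh.each_line { |l| out_fh << l }   # resumes at the current file position
    end
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'in.txt')
  dst = File.join(dir, 'out.txt')
  File.write(src, "header\nfirst\nsecond\n")
  copy_without_first_line(src, dst)
  puts File.read(dst)
end
```

Because each_line starts from wherever the file pointer already is, the single gets call is all the "skipping" that is needed; no per-iteration conditional and no line count are required.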

James Edward Gray II

···

Chad Perrin [2008-11-16 02:38]:

Actually, that looks pretty good, all things considered. I
didn't realize the IO#gets would increment the line number of the
file so the each_line iterator would start on the second line.

maybe i should have said "throw away first line" in the comment? ;-)
because that's just what it does. but james already gave a fine
explanation.

and wrt tim's comment: since we're using IO#gets there's no need to
check for empty files. with IO#readline it's another matter. but why
bother? did you (tim) have something else in mind?
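
a small sketch of the difference between the two at end of input, using an empty StringIO as a stand-in for an empty file (the variable names are just for illustration):

```ruby
require 'stringio'

empty = StringIO.new('')

gets_result = empty.gets     # nil at end of input, no exception

readline_raised =
  begin
    empty.readline           # raises EOFError at end of input
    false
  rescue EOFError
    true
  end

p gets_result
p readline_raised
```

so with gets the "throw away first line" step is harmless on an empty file, while readline would need a rescue.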

If that's how it works, I'm home free.

great!

Thank you for your help.

you're welcome.

cheers
jens