Popen and operations failing later

This is likely my brain missing something, but I'm having a hard
time figuring out the source of this error. I have code that
does:

class Foo

  def get_next_line
    if @file.nil?   # the file isn't open yet
       @file = IO.popen("zcat <file>")
    end
    begin
     line = @file.readline
     return line
    rescue => err
     print "eek: #{err}\n"
    end
  end

  def process
    line = get_next_line
    process_line(line)
  end
end

class Bar < Foo
  def process_line(line)
    # .. do something with the line ..
    # .. write to a file ..
  end
end

The problem is that the popen'd zcat process exits, typically with a
non-zero exit code (garbage at the end of the gzipped file).
The next time the program does something in the
process_line function - typically writing to a file,
or reading/writing to/from the database - _that_
operation generates an error.

I assume I'm doing something wrong with respect to
trapping the errors from the popen, but I'm baffled
about exactly what I should do.

Clues _very_ appreciated; I've been bashing my head
on this for a while. Stuffing in lots and lots of
begin / end checks around the operations in process_line
typically does the trick, but it's very.. uhh, inelegant.

Thanks!

  -dave

--
work: dga@lcs.mit.edu me: dga@pobox.com
      MIT Laboratory for Computer Science http://www.angio.net/

not sure if it's important but

   irb(main):001:0> pipe = IO::popen 'bash -c exit'
   => #<IO:0xb73e0e1c>
   irb(main):002:0> pipe.read
   => ""
   irb(main):003:0> pipe.read
   => ""
   irb(main):004:0> pipe.read
   => ""
   irb(main):005:0> pipe.read
   => ""
   irb(main):006:0> pipe.read
   => ""

Just because the process is dead doesn't mean a read fails: you can still
'read' from the pipe, you just get empty strings back.
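A minimal sketch of how the child's exit status could be checked, since the reads themselves won't complain. It assumes a POSIX `sh` is on the PATH and a Ruby new enough to accept an array command in `IO.popen`; `read_pipe_with_status` is a hypothetical helper name, not something from the original code.

```ruby
# Read everything from a child process and report how it exited.
# Closing a pipe opened with IO.popen waits for the child and sets $?.
def read_pipe_with_status(cmd)
  pipe = IO.popen(cmd)
  output = pipe.read        # "" at EOF, even if the child died early
  pipe.close                # reaps the child and sets $?
  [output, $?.exitstatus]
end

# Illustration only: a child that writes nothing and exits non-zero.
out, status = read_pipe_with_status(["sh", "-c", "exit 3"])
puts "output=#{out.inspect} status=#{status}"
```

Checking the status after `close` is one way to notice that zcat was unhappy without wrapping every later operation in begin/rescue.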

regards.

-a


On Sat, 16 Oct 2004, David G. Andersen wrote:

[snip: full quote of the original message]

--

EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
PHONE :: 303.497.6469
When you do something, you should burn yourself completely, like a good
bonfire, leaving no trace of yourself. --Shunryu Suzuki

===============================================================================

What sort of errors? In other words, are they appropriate to the
place where they occur, assuming that garbage got through from the dead
pipe, or do they look like they are the uncaught pipe errors magically
reincarnated in a new place?

     Looking at your code, I suspect that "line" might be some unhappy
value if @file is questionable coming out of popen. Are you getting
"eek"ed at?
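To illustrate the "unhappy value" concern, here is a small sketch. It swaps the zcat pipe for a `StringIO` so it runs anywhere (that substitution and the `Reader` class name are assumptions for the demo). When the rescue branch runs, the method's value is whatever `print` returns, which is nil, so the caller gets nil instead of a line.

```ruby
require "stringio"

# Stand-in for the original class; a StringIO replaces the popen'd zcat.
class Reader
  def initialize(io)
    @file = io
  end

  def get_next_line
    @file.readline
  rescue EOFError => err
    print "eek: #{err}\n"   # print returns nil, so get_next_line returns nil
  end
end

reader = Reader.new(StringIO.new("only line\n"))
first  = reader.get_next_line   # the one real line
second = reader.get_next_line   # readline raises EOFError; rescue yields nil
puts "first=#{first.inspect} second=#{second.inspect}"
```

If that is what is happening, process_line would be handed nil, so a nil check in process before calling it may be worth trying.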

    -- Markus


On Fri, 2004-10-15 at 18:49, David G. Andersen wrote:

[snip: full quote of the original message]

A few more details about the problem I've been having with
a bad interaction between popen and various other calls. To
recap briefly:

  def get_next_line
    @file = popen("zcat next_file") if file not open..
OR
    @file = Zlib::GzipReader.new(next_file)

    line = @file.readline
    (check to see if a line was returned; go back and open a new
     file if we reached the end of the file)
  end

  def do_some_work

     res = do a mysql query

    while (line = get_next_line)
** something = res.fetch_row
      .. do some work
    end
  end

The basic problem is that the res.fetch_row sometimes
returns nil when it shouldn't. But the same problem was
happening with some other things, like file write operations.
The problem always coincides with the end of one
file and opening a new one, though I don't know
if it happens before or after I open the new one.

If I have my MySQL dbh set to dbh.query_with_result = true
I don't experience the problem in the case mentioned above -
because all of the IO is handled before I start doing the
opens.

The problem is hard to replicate. Test runs with small datafiles
don't generate the problem, and adding error trapping around more
operations sometimes, but not always, makes the problem shift around.
The mysql death is more consistent than the writing-to-an-output-file
death. This is under ruby 1.8.1 (2004-05-02) [i386-freebsd5].
The problem occurs under FreeBSD 4.10 and 5.2.1.

I'm not sure where the problem would be, since it seems to
occur both when I popen a zcat or when I use GzipReader.
But it _seems_ like the symptom is that a file descriptor
is getting trashed or closed when it shouldn't be...
are there known ways to accidentally close file descriptors
that I might be stepping on?

(is it correct to call

  begin
    file.readline
  end
  if (we got to the end of the file)
    file.close
  end
?)
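For what it's worth, a minimal sketch of one way to structure the read/close cycle, assuming the GzipReader variant. `GzLineSource` and its constructor argument are hypothetical names. It uses `gets`, which returns nil at end of file, rather than `readline`, which raises EOFError there, and it closes each reader exactly once before moving to the next file.

```ruby
require "zlib"

# Hypothetical helper: walks a list of gzipped files and hands back one
# line per call, returning nil only when every file is exhausted.
class GzLineSource
  def initialize(paths)
    @paths = paths.dup
    @file = nil
  end

  def get_next_line
    loop do
      if @file.nil?
        path = @paths.shift
        return nil if path.nil?            # no files left
        @file = Zlib::GzipReader.open(path)
      end
      line = @file.gets                    # nil at EOF (readline would raise)
      return line if line
      @file.close                          # close once, at EOF
      @file = nil                          # forces the next file to be opened
    end
  end
end
```

Setting `@file` back to nil right after `close` means no later call can touch a closed reader, which rules out one class of accidental double-close.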

Any further clues on this would be appreciated again. :)

  -Dave


On Tue, Oct 19, 2004 at 01:15:56PM +0900, David G. Andersen scribed:

    while (line = get_next_line)
** something = res.fetch_row
      .. do some work
    end
  end

btw, someone asked about any error message from this. Nothing
gets caught by a begin/rescue/end around the fetch_row operation,
but checking the mysql error explicitly indicates

  2013: Lost connection to MySQL server during query

... which sounds a lot like the file descriptor getting
closed out from underneath it. The question still
remains, though - my bug, or ruby's? :)

Still mystified,

  -Dave


On Tue, Oct 19, 2004 at 01:15:56PM +0900, David G. Andersen scribed:

A few more details about the problem I've been having with
a bad interaction between popen and various other calls. To
recap briefly:

     res = do a mysql query

    while (line = get_next_line)
** something = res.fetch_row
      .. do some work

  Just as a followup - bug located. One of the tables ended
up blocking on read for a long time, causing the mysql server
to time out its connection after 60 seconds (the net_write_timeout
variable in the mysql server). Of course, there was no informative
message like "hey, my connection timed out." No ruby bug. Go ruby! :)

  -Dave