PTY.spawn error-prone?

I've been playing with PTY and it seems to me that this
implementation is exceptionally prone to errors.

Can anyone comment? I'm not knowledgeable of the implementation
details, but I plan to look at it now.

Thanks,
Hal

Hal Fulton wrote:

I've been playing with PTY and it seems to me that this
implementation is exceptionally prone to errors.

Can anyone comment? I'm not knowledgeable of the implementation
details, but I plan to look at it now.

What kind of errors? Exceptions, like the child exiting, and the fun related to that, or real bug-type errors?

Ben

Ben Giddings wrote:

Hal Fulton wrote:

I've been playing with PTY and it seems to me that this
implementation is exceptionally prone to errors.

Can anyone comment? I'm not knowledgeable of the implementation
details, but I plan to look at it now.

What kind of errors? Exceptions, like the child exiting, and the fun related to that, or real bug-type errors?

The former, I suppose. Isn't it the case that the child may exit for
a variety of reasons? And doesn't this exception sometimes occur for
no apparent reason at all?

Forgive my ignorance. Only part of this is based on experience.

Allow me to quote from the rexpect docs on rubyforge:

     #
     # Suppose you tell it to spawn off a command that doesn't exist....
     # What do you think will happen?
     # 1) You get a sane error code?
     # 2) It raises an appropriate Exception?
     # 3) Returns quite happily and at some random time later, throws a
     # PTY::ChildExited exception?
     #
     # Yup. Option 3) is what happens.
     # Evil isn't it?!

Well, that doesn't sound *so* bad... but I would swear that sometimes
it is crapping out for no reason at all. (Yeah, I know, "select isn't
broken" -- thanks, Dave.)

I'll investigate more. If anyone has knowledge, please chime in.

Thanks,
Hal

Ben Giddings wrote:

Hal Fulton wrote:

I've been playing with PTY and it seems to me that this
implementation is exceptionally prone to errors.

Let me elaborate a little, since I wrote the comments below....

Allow me to quote from the rexpect docs on rubyforge:

     #
     # Suppose you tell it to spawn off a command that doesn't exist....
     # What do you think will happen?
     # 1) You get a sane error code?
     # 2) It raises an appropriate Exception?
     # 3) Returns quite happily and at some random time later, throws a
     # PTY::ChildExited exception?
     #
     # Yup. Option 3) is what happens.
     # Evil isn't it?!

Firstly, most people _won't_ notice this as a problem. If you are working
on an unloaded system, the PTY::ChildExited exception happens
immediately. It is when you are driving the system to the edge by
using something like rexpect to drive hundreds of processes that the
problem hits.

Looking deep into the Pty code (cvs latest) we have...

a) fork()
b) if child
c) Do Lots of stuff
d) exec "sh"
e) shell does PATH lookup.
f) PATH lookup fails
g) shell exits "bad command"
h) ruby interpreter receives a "SIGCHLD".

Now from the moment we fork(), it is up to the OS scheduler, depending
on the current load, device (hard drive, keyboard, ...) activity, phase
of moon, etc., how long it takes to get from a) to h).

Thus several events can get overloaded _indistinguishably_ onto that
one SIGCHLD....

    * Exec'd program couldn't be started. (Command not found,
      permission denied etc.)
    * Exec'd program failed to start. (Not a valid program)
    * Exec'd program started and died.
    * Exec'd program started, did its thing correctly, and exited before
      you noticed.

I.e., in a sense the problem is not with Pty; it is with the notion of
invoking 'sh' to do our PATH lookups, IO redirection and command line
globbing. If we did those ourselves, we could get much more sensible
exceptions coming back than just PTY::ChildExited.
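
A rough sketch of that idea in Ruby: do the PATH lookup ourselves before
spawning, so a missing or non-executable command is reported right away
instead of showing up later as a PTY::ChildExited. The helper name
resolve_command is made up for illustration; it is not part of PTY or
rexpect.

```ruby
# Hypothetical helper (not part of PTY or rexpect): resolve a command name
# roughly the way a shell would, but raise a specific error if that fails.
def resolve_command(name, path = ENV.fetch("PATH", ""))
  # A name containing a slash is used as given, without a PATH search.
  candidates =
    if name.include?("/")
      [name]
    else
      path.split(File::PATH_SEPARATOR).map do |dir|
        # An empty PATH entry conventionally means the current directory.
        File.join(dir.empty? ? "." : dir, name)
      end
    end

  found = candidates.find { |file| File.file?(file) && File.executable?(file) }
  return found if found

  # Something with that name exists but cannot be run.
  raise Errno::EACCES, name if candidates.any? { |file| File.exist?(file) }

  raise Errno::ENOENT, name
end
```

The result could then be handed to the spawn call, e.g.
PTY.spawn(resolve_command("cat"), "file.txt"). There is still a small window
between the check and the exec in which the file could change, so this only
narrows the problem; it does not remove it.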

I have said it before in this forum, we need to be able to do what
"system()" and backtick and "open" and "exec" do _without_ invoking
"sh", that twenty year old downgrade to ruby.
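
In current Ruby versions, the multi-argument forms of system and exec, and
the array form of IO.popen, run the program directly with no "sh" in
between; only the single-string form may go through the shell. A small
sketch, assuming a Unix-like system with an echo program on the PATH:

```ruby
# Array form: "echo" is executed directly, so "*" is NOT expanded by a shell.
direct = IO.popen(["echo", "*"], &:read)

# Single-string form with a shell metacharacter: Ruby hands the line to sh,
# which expands "*" to the file names in the current directory, if any.
via_shell = IO.popen("echo *", &:read)

# With more than one argument, system also skips the shell. It returns nil
# when the command could not be executed at all.
started = system("no_such_command_xyz_12345", "--version")
```

Here direct holds a literal asterisk followed by a newline, while via_shell
depends on what is in the current directory.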

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.

On Thu, 24 Jun 2004, Hal Fulton wrote:

Well, that doesn't sound *so* bad... but I would swear that sometimes
it is crapping out for no reason at all. (Yeah, I know, "select isn't
broken" -- thanks, Dave.)

Funny thing... A day or two after reading this, and smiling
at the "select isn't broken" part.... One of my Ruby servers
with months of uptime that has always been very stable, started
hanging in the recvfrom call below:

  resp = if select([sock], nil, nil, UDP_RECV_TIMEOUT)
    sock.recvfrom(65536)
  end

I'd restart the process, and within about 24 hours it would
hang again, same place.

Since I hadn't changed anything in the operating system or
the ruby code... I started to wonder if maybe select WAS
broken. :)

My system (a linux box) had about 200 days uptime... so I
decided to try a reboot. The problem seems to have vanished.

I just thought it was funny, because "select isn't broken"
sounds like VERY good advice. Only.... once in a blue moon....
Hehe...
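
One defensive variation on the snippet above, offered as a sketch rather
than a diagnosis of this particular hang: the Linux select(2) man page notes
that a socket can be reported as readable and a following blocking read can
still block (for example when a datagram with a bad checksum is discarded).
Pairing select with a non-blocking receive avoids getting stuck in that
case. recv_with_timeout is a hypothetical helper name, and the default
timeout is an assumed value.

```ruby
require "socket"

# Hypothetical helper: returns [data, sender_info], or nil on timeout.
def recv_with_timeout(sock, timeout = 2.0)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
    return nil if remaining <= 0

    # Wait until the socket looks readable or the remaining time runs out.
    next unless IO.select([sock], nil, nil, remaining)

    begin
      return sock.recvfrom_nonblock(65536)
    rescue IO::WaitReadable
      # select said "readable" but nothing was there; go around and wait again.
    end
  end
end
```

The original call would then become resp = recv_with_timeout(sock,
UDP_RECV_TIMEOUT), with resp still nil on timeout.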

Regards,

Bill

From: "Hal Fulton" <hal9000@hypermetrics.com>

John Carter wrote:

Ben Giddings wrote:

Hal Fulton wrote:

I've been playing with PTY and it seems to me that this
implementation is exceptionally prone to errors.

Let me elaborate a little, since I wrote the comments below....

Thanks, John. Didn't mean to quote you out of context.

I've just figured out one of my problems: Sometimes the spawned
process runs fine, it just runs to completion too fast for me to
do anything.

The 'read' object returned is nil apparently. I'm confused.

Example: I 'cat' a one-line file. Naturally it's very fast.

How should I handle this?
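
One way to cope with a child that finishes almost immediately, as a sketch
assuming a Linux-like system and a current Ruby: read from the pty until
there is no more input, treat Errno::EIO as end of output (that is how Linux
reports that the child side has closed), and then collect the exit status
with Process.wait2. run_in_pty is a hypothetical helper name.

```ruby
require "pty"

# Hypothetical helper: run a command on a pty and return [output, status].
def run_in_pty(command, *args)
  reader, writer, pid = PTY.spawn(command, *args)
  output = +""
  begin
    loop { output << reader.readpartial(4096) }
  rescue EOFError, Errno::EIO
    # The child has closed its side of the pty: no more output.
  end
  # Reap the child so its exit status is available to the caller.
  _, status = Process.wait2(pid)
  [output, status]
ensure
  reader&.close
  writer&.close
end
```

For the 'cat' case this would look like out, status = run_in_pty("cat",
"one_line.txt"). Output read from a pty normally has "\r\n" line endings, so
compare with that in mind.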

Hal
