Really confused about exceptions now

From: Mark Probert <probertm@acm.org>
Subject: Re: Really confused about exceptions now

Hi ..

>
> When you do not have explicit "return" statement in your rescue, the
> execution falls down to the end of the method and returns whatever the
> last executed statement had returned (FYI: puts() returns nil).
>
Thank you. So, a more correct version of the method is:

  def alive?
      begin
          t = TCPSocket.new(@host, @port) ### line 143
          return true
          
      rescue Errno::ETIMEDOUT
          @exception = " Timed out (#{@host}:#{@port})"
      rescue SocketError => e
          @exception = " Socket error - #{e}"
      rescue Exception => e
          @exception = e
      end
      return false
  end

Just for clarity, you could restructure that one more time :)

def alive?
  begin
    t = TCPSocket.new(@host, @port) ### line 143

  rescue Errno::ETIMEDOUT
    @exception = " Timed out (#{@host}:#{@port})"
    false
  rescue SocketError => e
    @exception = " Socket error - #{e}"
    false
  rescue Exception => e
    @exception = e
    false

  # Alive!
  else
    true

  # Clean up
  ensure
    # Close the socket if it's open
  end
end
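
The return-value rule discussed above can be checked in isolation. The following is a minimal sketch, not code from the thread: both method names are hypothetical, and ArgumentError stands in for the socket errors so that no network is needed. It shows that the last expression of the rescue or else clause becomes the method's value, that the value of ensure is discarded, and that a rescue clause ending in puts makes the method return nil.

```ruby
# Hypothetical helpers, not from the thread: they only illustrate which
# expression becomes the value of a method using begin/rescue/else/ensure.

def value_with_rescue(should_fail)
  begin
    raise ArgumentError, "boom" if should_fail
  rescue ArgumentError
    false        # last expression of the rescue clause is the result
  else
    true         # runs only when the begin body raised nothing
  ensure
    # ensure always runs, but its value is discarded
  end
end

def value_without_explicit_result
  begin
    raise ArgumentError, "boom"
  rescue ArgumentError => e
    puts "rescued: #{e.message}"   # puts returns nil, so the method does too
  end
end

p value_with_rescue(true)
p value_with_rescue(false)
p value_without_explicit_result
```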

However, I still have the problem that this works in my test harness and not
in my actual code at the TCPSocket() call:

14:30 (kant)$ ruby test.rb
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar

14:31 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
  .. Running 10 commands on 2 nodes.
  .. processing the nodes... (thread count=35)
  .. threading now ...
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
Exception `SocketError' at ./bsn_a.rb:143 - getaddrinfo: hostname nor
   servname provided, or not known
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
Exception `SocketError' at ./bsn_a.rb:143 - getaddrinfo: hostname nor
   servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar

I have no idea as to why the SocketError would be caught in one case and not
the other.

Are you doing anything else with 't' at all? Or maybe a duplicate
file you're including? You might work around it by doing
BasicSocket.do_not_reverse_lookup = true?

Regards,

--
-mark. (probertm at acm dot org)


On Thursday 10 February 2005 13:25, Gennady Bystritksy wrote:

"E S" <eero.saynatkari@kolumbus.fi> wrote in message
news:20050211034606.MREY15813.fep31-app.kolumbus.fi@mta.imail.kolumbus.fi...

> From: Mark Probert <probertm@acm.org>
> Subject: Re: Really confused about exceptions now
>
> Hi ..
>
> >
> > When you do not have explicit "return" statement in your rescue, the
> > execution falls down to the end of the method and returns whatever the
> > last executed statement had returned (FYI: puts() returns nil).
> >
> Thank you. So, a more correct version of the method is:
>
> def alive?
>     begin
>         t = TCPSocket.new(@host, @port) ### line 143
>         return true
>
>     rescue Errno::ETIMEDOUT
>         @exception = " Timed out (#{@host}:#{@port})"
>     rescue SocketError => e
>         @exception = " Socket error - #{e}"
>     rescue Exception => e
>         @exception = e
>     end
>     return false
> end

Just for clarity, you could restructure that one more time :)

def alive?
  begin
    t = TCPSocket.new(@host, @port) ### line 143

  rescue Errno::ETIMEDOUT
    @exception = " Timed out (#{@host}:#{@port})"
    false
  rescue SocketError => e
    @exception = " Socket error - #{e}"
    false
  rescue Exception => e
    @exception = e
    false

  # Alive!
  else
    true

  # Clean up
  ensure
    # Close the socket if it's open
  end
end

Hm, the usual idiom is this:

acquire_resource
begin
  use resource
ensure
  free resource
end

Which means ensure is never called if the acquire fails. So I would do it
like this:

def alive?
  # you might want to clear @exception here

  begin
    TCPSocket.new(@host, @port).close
    # we would have a separate begin-ensure-end block
    # here if we were using the socket
    true
  rescue Exception => e
    @exception = e
    false
  end
end

... and defer the print formatting of the exception to a later point in
time. These are the advantages:

- smaller and more readable code

- the original exception is retained for later use

- faster, as there is no processing done on the exception,
   which might never occur if @exception is never printed
   or otherwise evaluated
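
The acquire-then-begin/ensure idiom described above can be tried without a socket. This is only a sketch under stated assumptions: DemoResource and with_resource are hypothetical names invented for the illustration, and the "resource" merely records whether it was closed. The point it shows is that ensure runs only when the acquisition succeeded.

```ruby
# Hypothetical resource, used only to illustrate the idiom.
class DemoResource
  def self.acquire(fail_on_acquire)
    raise IOError, "cannot acquire" if fail_on_acquire
    new
  end

  def initialize
    @closed = false
  end

  def closed?
    @closed
  end

  def close
    @closed = true
  end
end

def with_resource(fail_on_acquire, log)
  # Acquire outside the begin block: if this raises, there is nothing to free.
  res = DemoResource.acquire(fail_on_acquire)
  begin
    log << :used
    res
  ensure
    log << :freed
    res.close
  end
end

log = []
res = with_resource(false, log)

failed_log = []
begin
  with_resource(true, failed_log)
rescue IOError
  # Acquisition failed, so the ensure block never ran and failed_log is empty.
end
```

After the first call the log holds :used followed by :freed and the resource is closed; after the failed acquisition nothing was logged at all.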

You might want to consider an alternative approach:

def error?
  begin
    TCPSocket.new(@host, @port).close
    nil
  rescue Exception => e
    e
  end
end

This returns the exception directly and you can do whatever you want with
it or just use it in a boolean context.
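
As a rough illustration of using the returned exception in a boolean context, here is a sketch that swaps the socket for a plain block so it runs anywhere. The name error_from is hypothetical, and it rescues StandardError rather than Exception (an assumption made here so that Interrupt and SystemExit still propagate), which differs from the version above.

```ruby
# Hypothetical generalisation of error?: runs the block and returns the
# exception it raised, or nil when the block completed normally.
# Assumption: StandardError is rescued instead of Exception.
def error_from
  yield
  nil
rescue StandardError => e
  e
end

# nil is falsy and an exception object is truthy, so the result can be
# tested directly and formatted only at the point of use.
if (err = error_from { Integer("not a number") })
  puts "failed: #{err.class}: #{err.message}"
end

ok = error_from { Integer("42") }
puts "no error" unless ok
```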

Kind regards

    robert


> On Thursday 10 February 2005 13:25, Gennady Bystritksy wrote:

Hi ..

Which means ensure is never called if the acquire fails. So I would do it
like this:

  <excellent version removed>

Thank you.

Ara also mentioned, off-list, that I am catching the exception, otherwise the
app would be failing. He suggested that the issue could also be with how I
am using @exception.

I am not so sure that is the case. The basic driver that I use looks like:

    require 'bsn_a.rb'
    def run
        threads = []
        @addr.each do |ip_addr|
            threads << Thread.new(ip_addr) do |node|
              name, ip, usr, pwd = node.split(/:/)
              
              bsn = BSN.new(node) #### deliberately fail for now
              bsn.user = usr
              bsn.pwd = pwd

              puts " .. trying to go to #{nn} (#{bsn.host})"
              f = bsn.login #### this is line 144
              puts " --> #{(f ? " got there" : " failed ")} #{bsn.host}"
            end
        end
        threads.each { |thr| thr.join }
    end

The program creates a BSN object then calls login().

  def login
      f = alive?
      if not f
          _pr "login --> host not alive (#{@exception})" ### log to file
          return false
      end
      ### go on with login stuff
  end

So, it is calling alive? to find out if we can connect, before we try and
connect for real.

My problem is that when I do this from my test harness, line 144 doesn't throw
an exception:

8:48 (kant)$ ruby test.rb
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar

However, from the 'real' program it does throw one at line 144 ???

8:49 (kant)$ ruby bsncoll.rb -g -n eeua.txt -c flow.txt -d data
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:144 - getaddrinfo: hostname
    nor servname provided, or not known
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:144 - getaddrinfo: hostname
    nor servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar
  - post-processing data ... please wait

Ara is right that the program doesn't die. However, it is, in the second case
throwing a message to stderr that the first program doesn't do. I am not
certain, but I think that this may be part of my memory leak problem that I
mentioned in another thread. And I don't seem to be able to make it go away!

This behaviour seems abnormal. If the exception is thrown at 144, then I
should be able to rescue the way that Robert points out. Except, it doesn't
in this case.

Still confused ...


On Friday 11 February 2005 00:20, Robert Klemme wrote:

--
-mark. (probertm at acm dot org)

"Mark Probert" <probertm@acm.org> wrote in message
news:200502110912.31423.probertm@acm.org...

Hi ..

>
> Which means ensure is never called if the acquire fails. So I would do it
> like this:
>
> <excellent version removed>
>
Thank you.

Ara also mentioned, off-list, that I am catching the exception, otherwise the
app would be failing. He suggested that the issue could also be with how I
am using @exception.

I am not so sure that is the case. The basic driver that I use looks like:

    require 'bsn_a.rb'
    def run
        threads = []
        @addr.each do |ip_addr|
            threads << Thread.new(ip_addr) do |node|
              name, ip, usr, pwd = node.split(/:/)

              bsn = BSN.new(node) #### deliberately fail for now
              bsn.user = usr
              bsn.pwd = pwd

              puts " .. trying to go to #{nn} (#{bsn.host})"
              f = bsn.login #### this is line 144
              puts " --> #{(f ? " got there" : " failed ")} #{bsn.host}"
            end
        end
        threads.each { |thr| thr.join }
    end

The program creates a BSN object then calls login().

  def login
      f = alive?
      if not f
          _pr "login --> host not alive (#{@exception})" ### log to file
          return false
      end
      ### go on with login stuff
  end

So, it is calling alive? to find out if we can connect, before we try and
connect for real.

My problem is that when I do this from my test harness, line 144 doesn't throw
an exception:

8:48 (kant)$ ruby test.rb
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar

However, from the 'real' program it does throw one at line 144 ???

8:49 (kant)$ ruby bsncoll.rb -g -n eeua.txt -c flow.txt -d data
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:144 - getaddrinfo: hostname
    nor servname provided, or not known
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:144 - getaddrinfo: hostname
    nor servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar
  - post-processing data ... please wait

Ara is right that the program doesn't die. However, it is, in the second case
throwing a message to stderr that the first program doesn't do. I am not
certain, but I think that this may be part of my memory leak problem that I
mentioned in another thread. And I don't seem to be able to make it go away!

Did you try with "Thread.abort_on_exception = true"? Maybe you just don't
see the exception in one of the cases.
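
For reference, a minimal sketch of how an exception raised inside a thread surfaces when Thread.abort_on_exception is left at its default of false: only that thread terminates, and the exception is re-raised in the caller when the thread is joined. The helper name join_and_capture is hypothetical, and ArgumentError stands in for the socket error.

```ruby
# Hypothetical helper: joins a thread and returns the exception that
# terminated it, or nil if the thread finished normally.
def join_and_capture(thread)
  thread.join
  nil
rescue StandardError => e
  e
end

# With abort_on_exception at its default (false), the raise below ends only
# the worker thread. Newer Ruby versions may additionally print a report for
# the dead thread to stderr.
worker = Thread.new { raise ArgumentError, "raised inside the thread" }

err = join_and_capture(worker)
puts "join re-raised: #{err.class}: #{err.message}" if err
```

A driver that joins its threads, as the run method quoted above does, would therefore see an unrescued exception at the join call rather than at the point where it was raised.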

This behaviour seems abnormal. If the exception is thrown at 144, then I
should be able to rescue the way that Robert points out. Except, it doesn't
in this case.

Currently I have no explanation for your problem. Is it really the same
ruby etc.?

Personally I would not do a test beforehand to see whether I can connect.
I'd just connect, catch exceptions and handle them. Because even if
alive? succeeded your connect can still fail (for example if something
happened on the network between your test and the real connect).

Btw, you could as well implement something like BSN.open.

class BSN
  def self.open(node,user,pass)
    bsn = self.new(node)
    bsn.user = user
    bsn.pwd = pass

    bsn.connect # socket connect
    begin
      yield bsn
    ensure
      bsn.close # logout and disconnect
    end
  end
end

Then you can do

BSN.open(node,user,pass) do |bsn|
  bsn.login
  ...
end

and be sure that the bsn is always properly closed.
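
To see the guarantee in action without a real BSN, here is a self-contained sketch. DemoSession is a hypothetical stand-in that performs no network access and only records which methods were called; its method names are assumptions chosen to mirror the outline above.

```ruby
# Hypothetical stand-in for BSN: no network access, it only records calls.
class DemoSession
  attr_reader :events

  def self.open(node)
    session = new(node)
    session.connect            # acquire outside begin, as in the idiom
    begin
      yield session
    ensure
      session.close            # runs even if the block raises
    end
  end

  def initialize(node)
    @node = node
    @events = []
  end

  def connect
    @events << :connect
  end

  def login
    @events << :login
  end

  def close
    @events << :close
  end
end

seen = nil
begin
  DemoSession.open("Foo:10.10.10.5:foo:bar") do |session|
    seen = session
    session.login
    raise "login stuff failed"
  end
rescue RuntimeError
  # The error still reaches the caller, but only after close has run.
end

p seen.events
```

Even though the block raises, the recorded events end with :close, which is the property the open method is meant to provide.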

Kind regards

    robert


On Friday 11 February 2005 00:20, Robert Klemme wrote:

Hi ..

Did you try with "Thread.abort_on_exception = true"? Maybe you just don't
see the exception in one of the cases.

Yes, I did. No change in behaviour.

Currently I have no explanation for your problem. Is it really the same
ruby etc.?

Yup.

10:04 (kant)$ ruby -v
ruby 1.8.2 (2004-12-25) [i386-freebsd5.3]

and

13:11 (zcars12g)$ ruby -v
ruby 1.8.0 (2003-08-04) [sparc-solaris2.8]

Both show exactly the same behaviour.

Personally I would not do a test beforehand to see whether I can connect.
I'd just connect, catch exceptions and handle them. Because even if
alive? succeeded your connect can still fail (for example if something
happened on the network between your test and the real connect).

Thank you. I really like your implementation of open() and will do it. I'll
post back and see if there is a difference in behaviour.

Regards,


On Friday 11 February 2005 09:35, Robert Klemme wrote:

--
-mark. (probertm at acm dot org)