Really confused about exceptions now

On top of the memory leak issue, I have been trying to track down unhandled
exceptions in my code. I have run across a very strange behavior that I will
try and explain.

Problem(?) code (line numbers from bsn_a.rb)

  141 def alive?
  142 t = TCPSocket.new(@host, @port)
  143 return true
  144
  145 rescue Errno::ETIMEDOUT
  146 @exception = " Timed out (#{@host}:#{@port})"
  147 rescue SocketError => e
  148 @exception = " Socket error - #{e}"
  149 rescue Exception => e
  150 @exception = e
  151 return false
  152 end

So, in a test driver, it all works as expected with junk data:

11:42 (kant)$ ruby test.rb
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Foo (Bar:10.10.10.6:foo:bar)
  --> failed Bar:10.10.10.6:foo:bar

However, in my actual program, something really bizarre happens:

11:43 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
  .. Running 10 commands on 2 nodes.
  .. Data going into directory --> data/20050210_1145_eeua
  .. processing the nodes... (thread count=35)
  .. threading now ...
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
  servname provided, or not known
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
  servname provided, or not known
  Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:352 -
   getaddrinfo: hostname nor servname provided, or not known
  Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:352 -
   getaddrinfo: hostname nor servname provided, or not known
  Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:360 -
   getaddrinfo: hostname nor servname provided, or not known
  --> failed Foo:10.10.10.5:foo:bar
  Exception `SocketError' at /usr/local/lib/ruby/1.8/net/telnet.rb:360 -
   getaddrinfo: hostname nor servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar

Telnet is throwing a 'SocketError' and line 142 is throwing one, and neither
is being caught!

Now, if I comment out 147-148, I get the following from the program:

11:44 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
  .. Running 10 commands on 2 nodes.
  .. Data going into directory --> data/20050210_1144_eeua
  .. processing the nodes... (thread count=35)
  .. threading now ...
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
  servname provided, or not known
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
  Exception `SocketError' at ./bsn_a.rb:142 - getaddrinfo: hostname nor
  servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar
  --> failed Foo:10.10.10.5:foo:bar

So, it throws the exception at line 142, but the Telnet exception goes away!?!

Can anyone shed any light on what is happening here? I really have no clue
how to proceed at this point.

As far as I can tell, the test driver is an accurate model of the 'real'
program -- it is threaded, it has the same class hierarchy, and it includes the
same libraries; it just doesn't have all the pre- and post-processing in it.
Both include the same 'bsn_a.rb'.

11:52 (kant)$ ruby -v
ruby 1.8.2 (2004-12-25) [i386-freebsd5.3]

Regards,

--
-mark. (probertm at acm dot org)

* Mark Probert <probertm@acm.org> [2005-02-11 04:56:09 +0900]:

On top of the memory leak issue, I have been trying to track down unhandled
exceptions in my code. I have run across a very strange behavior that I will
try and explain.

Problem(?) code (line numbers from bsn_a.rb)

  141 def alive?
  142 t = TCPSocket.new(@host, @port)
  143 return true
  144
  145 rescue Errno::ETIMEDOUT
  146 @exception = " Timed out (#{@host}:#{@port})"

# returns a String, which is true.

  147 rescue SocketError => e
  148 @exception = " Socket error - #{e}"

# returns a String, which is true.

  149 rescue Exception => e
  150 @exception = e
  151 return false
  152 end

Just a quick note, your exceptions are returning true.

--
Jim Freeze
Code Red. Code Ruby

Hi ..

Just a quick note, your exceptions are returning true.

That doesn't seem to be the case. Just checked with:

12:43 (kant)$ cat test2.rb
#! /usr/local/bin/ruby
require 'socket'

def alive?(host, port=80)
    t = TCPSocket.new(host, port)
    return true
  rescue Errno::ETIMEDOUT
    puts " ERR: Timed out (#{host}:#{port})"
  rescue SocketError => e
    puts " ERR: Socket problem: #{e}"
  rescue Exception => e
    puts " ERR: #{e}"
    return false
end

ip = ARGV[0]
f = alive?(ip)
if f
    puts "got to #{ip}"
else
    puts "failed (#{ip})"
end

12:45 (kant)$ ruby test2.rb "127.0.0.1"
got to 127.0.0.1

12:47 (kant)$ ruby test2.rb "foo"
ERR: Socket problem: getaddrinfo: hostname nor servname provided, or not
known
failed (foo)

I just commented out the 'return false', and that makes no difference to the
result. So, it seems that when the exception is rescued, the method returns
'nil'. Interesting.

Thanks for your help.

Regards,

On Thursday 10 February 2005 12:17, jim@freeze.org wrote:

--
-mark. (probertm at acm dot org)

When you do not have an explicit "return" statement in your rescue clause, execution falls through to the end of the method, which returns the value of the last statement executed (FYI: puts() returns nil).

And the 'if' statement treats nil and false as false; everything else is true.
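
To make the difference concrete, here is a minimal sketch. The Probe class, its method names and the simulated IOError are made up for illustration and are not taken from bsn_a.rb. It seems to reconcile the two observations in this thread: a rescue clause that ends in an assignment returns the assigned String (true in an 'if'), while one that ends in puts returns nil.

```ruby
# Hypothetical class, for illustration only.
class Probe
  # Rescue clause ends in an assignment: the method returns the String.
  def ends_in_assignment
    raise IOError, "simulated failure"
  rescue IOError => e
    @exception = "Socket error - #{e}"
  end

  # Rescue clause ends in puts: the method returns nil.
  def ends_in_puts
    raise IOError, "simulated failure"
  rescue IOError => e
    puts "ERR: #{e}"
  end

  # Rescue clause ends in an explicit false: the method returns false.
  def ends_in_false
    raise IOError, "simulated failure"
  rescue IOError => e
    @exception = "Socket error - #{e}"
    false
  end
end

if __FILE__ == $0
  probe = Probe.new
  puts "assignment: treated as true"  if probe.ends_in_assignment
  puts "puts: treated as false"       unless probe.ends_in_puts
  puts "explicit false: treated as false" unless probe.ends_in_false
end
```

Only the first variant makes a caller's 'if' take the success branch after a failure.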

Gennady.

On Thursday 10 February 2005 12:17, jim@freeze.org wrote:

Hi ..

When you do not have an explicit "return" statement in your rescue clause,
execution falls through to the end of the method, which returns the value of
the last statement executed (FYI: puts() returns nil).

Thank you. So, a more correct version of the method is:

  def alive?
      begin
          t = TCPSocket.new(@host, @port) ### line 143
          return true
          
      rescue Errno::ETIMEDOUT
          @exception = " Timed out (#{@host}:#{@port})"
      rescue SocketError => e
          @exception = " Socket error - #{e}"
      rescue Exception => e
          @exception = e
      end
      return false
  end
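
A possible variant of the method above, as a sketch only: HostCheck and its constructor are hypothetical stand-ins for the real class in bsn_a.rb. It returns false from inside each rescue clause, rescues SocketError and SystemCallError rather than Exception (an assumption about which errors matter here), and closes the socket in an ensure clause so the check does not leave a connection open.

```ruby
require 'socket'

# Hypothetical stand-in for the class that owns alive? in bsn_a.rb.
class HostCheck
  attr_reader :exception

  def initialize(host, port)
    @host = host
    @port = port
  end

  # Returns true if a TCP connection can be opened, false otherwise.
  def alive?
    socket = TCPSocket.new(@host, @port)
    true
  rescue Errno::ETIMEDOUT
    @exception = "Timed out (#{@host}:#{@port})"
    false
  rescue SocketError, SystemCallError => e
    # SystemCallError covers the Errno::* family, e.g. connection refused.
    @exception = "Socket error - #{e}"
    false
  ensure
    # socket is nil when TCPSocket.new raised; ensure does not change the
    # method's return value.
    socket.close if socket
  end
end

if __FILE__ == $0
  check = HostCheck.new(ARGV[0] || "127.0.0.1", 80)
  puts(check.alive? ? "reachable" : "failed: #{check.exception}")
end
```

Because every path yields an explicit true or false, the caller's 'if' no longer depends on what the last statement in a rescue clause happens to return.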

However, I still have the problem that this works in my test harness and not
in my actual code at the TCPSocket() call:

14:30 (kant)$ ruby test.rb
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
  --> failed Foo:10.10.10.5:foo:bar

14:31 (kant)$ ruby healthcollect.rb -g -n eeua.txt -c flow.txt -d data
  .. Running 10 commands on 2 nodes.
  .. processing the nodes... (thread count=35)
  .. threading now ...
  .. trying to go to Foo (Foo:10.10.10.5:foo:bar)
Exception `SocketError' at ./bsn_a.rb:143 - getaddrinfo: hostname nor
   servname provided, or not known
  --> failed Foo:10.10.10.5:foo:bar
  .. trying to go to Bar (Bar:10.10.10.6:foo:bar)
Exception `SocketError' at ./bsn_a.rb:143 - getaddrinfo: hostname nor
   servname provided, or not known
  --> failed Bar:10.10.10.6:foo:bar

I have no idea as to why the SocketError would be caught in one case and not
the other.
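
One thing that may be worth ruling out, as a guess and not a diagnosis: the "Exception `SocketError' at file:line - message" lines have the format Ruby writes to stderr whenever an exception is raised while $DEBUG is true (for example under ruby -d), and it writes them even when the exception is then rescued. Whether healthcollect.rb actually turns $DEBUG on is an assumption; nothing in the output above confirms it. The sketch below uses made-up method names to show a rescued exception still being reported:

```ruby
require 'stringio'

# Hypothetical method: raises an exception and rescues it itself.
def rescued_failure
  raise ArgumentError, "simulated failure"
rescue ArgumentError
  false
end

# Runs the block with $DEBUG switched on and returns
# [block result, text written to $stderr while it ran].
def capture_debug_output
  saved_debug = $DEBUG
  saved_stderr = $stderr
  $stderr = StringIO.new
  $DEBUG = true
  result = yield
  [result, $stderr.string]
ensure
  $DEBUG = saved_debug
  $stderr = saved_stderr
end

if __FILE__ == $0
  result, trace = capture_debug_output { rescued_failure }
  puts "returned: #{result.inspect}"
  puts "stderr:   #{trace}"
end
```

If that is what is happening, the "--> failed" lines would mean the rescue clauses are running as intended, and the Exception lines would be debug reporting only.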

Regards,

On Thursday 10 February 2005 13:25, Gennady Bystritksy wrote:

--
-mark. (probertm at acm dot org)