Stream closed (IOError)

This code:

        puts "closed?: %s" % @socket.closed?.to_s

        begin
            @socket.close
        rescue Exception => e
            return
        rescue IOError => e
            return
        rescue Errno => e
            return
        end

outputs this:

closed?: false
/Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd/xmpp/stream.rb:231:in
`ioloop': stream closed (IOError)
  from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:196:in `loop'
  from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:196:in `ioloop'
  from ./bin/xmppd:20
praxis:trunk rakaur$

Latest stable Ruby (1.8.7-p72). I don't see how this is possible.
Google seems mostly silent on the matter unless you speak Japanese.

Any ideas?

Hi,

···

In message "Re: stream closed (IOError)" on Sun, 5 Oct 2008 12:51:52 +0900, "Eric Will" <rakaur@malkier.net> writes:

This code:

       puts "closed?: %s" % @socket.closed?.to_s

       begin
           @socket.close
       rescue Exception => e
           return
       rescue IOError => e
           return
       rescue Errno => e
           return
       end

outputs this:

closed?: false
/Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd/xmpp/stream.rb:231:in
`ioloop': stream closed (IOError)
from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:196:in `loop'
from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:196:in `ioloop'
from ./bin/xmppd:20
praxis:trunk rakaur$

It's hard to tell without full executable script, but at least the
exception was raised _outside_ of your code snippet. I believe you
touched @socket after you closed it.

              matz.

Line 231 is the @socket.close line. Nothing gets done to the socket
after this, or if it is, it's not getting to it because it crashes on
@socket.close. Nothing should even be touching the entire object after
this, because it sets a flag saying "I'm dead, please GC me."

It's a large program, and the code is available at
http://synapse.malkier.net/browser. The socket closing is in
lib/xmppd/xmpp/stream.rb. The code right there doesn't have the
exception catching written in, because I removed the thing that causes
this to happen, which is this:

        # Check for connection timeouts.
        #Timer::Timer.new('check connection timeouts', 60, true) do
        # $connections.each do |c|
        # c.error('connection-timeout') if ($time - c.rtime) >= 30
        # end

I'll try putting the exception around this. One second.

No, that doesn't work. Nor does this:

        # Check for connection timeouts.
        Timer::Timer.new('check connection timeouts', 60, true) do
            $connections.each do |c|
                next if c.dead? or c.socket.closed?
                c.error('connection-timeout') if ($time - c.rtime) >= 30
            end

The socket.closed? returns false, then ONE LINE LATER when I call
socket.close, the IOError is raised and cannot seem to be rescued.

Ok, I've updated the code in svn. Relevant files are
http://synapse.malkier.net/browser/trunk/lib/xmppd.rb (which is where
the Timer is started),
http://synapse.malkier.net/browser/trunk/lib/xmppd/timer.rb (which is
the Timers themselves), and
http://synapse.malkier.net/browser/trunk/lib/xmppd/xmpp/stream.rb
(which is where the socket is closing.

This code outputs this:

/Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd/xmpp/stream.rb:230:in
`ioloop': stream closed (IOError)
  from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:197:in `loop'
  from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:197:in `ioloop'
  from ./bin/xmppd:20

Perhaps it has to do with the timers running in a separate thread?
Putting begin/rescue in the tread doesn't catch it, either.

Thanks.

Hi,

Line 231 is the @socket.close line. Nothing gets done to the socket
after this, or if it is, it's not getting to it because it crashes on
@socket.close. Nothing should even be touching the entire object after
this, because it sets a flag saying "I'm dead, please GC me."

I don't think we can trust line numbers here, maybe due to bugs. The
back trace does not make sense at all.

/Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd/xmpp/stream.rb:230:in
`ioloop': stream closed (IOError)
from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:197:in `loop'
from /Users/rakaur/Sources/wc/ruby/synapse/trunk/lib/xmppd.rb:197:in `ioloop'
from ./bin/xmppd:20

stream.rb:230 is not within ioloop; xmppd:20 does not call ioloop.
The fact rescue does not catch the exception also suggest the wrong
line numbers. How about stopping using timer library, as a start?

              matz.

···

In message "Re: stream closed (IOError)" on Sun, 5 Oct 2008 23:54:36 +0900, "Eric Will" <rakaur@malkier.net> writes:

I don't think we can trust line numbers here, maybe due to bugs. The
back trace does not make sense at all.
[...]
stream.rb:230 is not within ioloop; xmppd:20 does not call ioloop.
The fact rescue does not catch the exception also suggest the wrong
line numbers. How about stopping using timer library, as a start?

Hey matz. Thanks for your help. Everything is within ioloop.
bin/xmppd:20 is `XMPPd.instance.ioloop`
When I don't use the timers, this doesn't happen. In fact, it doesn't
happen until I use that specific Timer to do connection timeouts in
lib/xmppd.rb.

                                                       matz.

-- Eric Will

···

On Sun, Oct 5, 2008 at 11:46 AM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Not sure if your problem could be related to this post:

···

El Domingo, 5 de Octubre de 2008, Eric Will escribió:

On Sun, Oct 5, 2008 at 11:46 AM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:
> I don't think we can trust line numbers here, maybe due to bugs. The
> back trace does not make sense at all.
> [...]
> stream.rb:230 is not within ioloop; xmppd:20 does not call ioloop.
> The fact rescue does not catch the exception also suggest the wrong
> line numbers. How about stopping using timer library, as a start?

Hey matz. Thanks for your help. Everything is within ioloop.
bin/xmppd:20 is `XMPPd.instance.ioloop`
When I don't use the timers, this doesn't happen. In fact, it doesn't
happen until I use that specific Timer to do connection timeouts in
lib/xmppd.rb.

--
Iñaki Baz Castillo

Not sure I grasp everything going on there. I don't use threads for
anything EXCEPT running my timers, which use Timeout. This could
potentially be the culprit, but I have no idea how that helps me.
Trying to rescue the IOError in my Timer library hasn't worked,
either.

matz, any input on this?

Thanks guys.

-- Eric Will

···

On Sun, Oct 5, 2008 at 12:03 PM, Iñaki Baz Castillo <ibc@aliax.net> wrote:

Not sure if your problem could be related to this post:

http://blog.headius.com/2008/02/rubys-threadraise-threadkill-timeoutrb.html

Hi,

timer.rb (820 Bytes)

···

In message "Re: stream closed (IOError)" on Mon, 6 Oct 2008 01:10:05 +0900, "Eric Will" <rakaur@malkier.net> writes:

matz, any input on this?

Since I still can not reproduce the problem, I can not say clearly, but:

* timer callback may touch the socket right after it's closed.
* you've mixed fork and threads, which may not work properly.
* you don't need timeout.rb. try attached simpler version.

              matz.

Since I still can not reproduce the problem, I can not say clearly, but:

* timer callback may touch the socket right after it's closed.
* you've mixed fork and threads, which may not work properly.
* you don't need timeout.rb. try attached simpler version.

Thanks. I should have thought of that to begin with. :slight_smile:

It doesn't have to do with fork. In all my tests I run it with
--nofork (which skips forking). The only thing using threads was
timer.rb.

Googling found me some Japanese posts, but that didn't help. The rest
seemed to be fixed by checking socket#closed? first.

Thanks so much.

                                                       matz.

-- Eric Will

···

On Sun, Oct 5, 2008 at 12:33 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

* you don't need timeout.rb. try attached simpler version.

While simpler version, this causes the same error...

                                                       matz.

-- Eric Will

···

On Sun, Oct 5, 2008 at 12:33 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Hi,

···

In message "Re: stream closed (IOError)" on Mon, 6 Oct 2008 01:43:14 +0900, "Eric Will" <rakaur@malkier.net> writes:

Googling found me some Japanese posts, but that didn't help. The rest
seemed to be fixed by checking socket#closed? first.

Show me the URL if you need further info, since I can read some
Japanese. :wink:

              matz.

Hi,

···

In message "Re: stream closed (IOError)" on Mon, 6 Oct 2008 04:56:21 +0900, "Eric Will" <rakaur@malkier.net> writes:

On Sun, Oct 5, 2008 at 12:33 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

* you don't need timeout.rb. try attached simpler version.

While simpler version, this causes the same error...

Hmm, let's step forward one by one. You have two timers, which one is
causing the problem? What if wrapping timer callbacks with begin and
rescue?

              matz.

Show me the URL if you need further info, since I can read some
Japanese. :wink:

I'm not sure if these are relevant, since Google now shows lots of
this thread. :slight_smile:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/7000
http://133.44.98.95/cgi-bin/vframe.rb/ruby/ruby-list/31677?31634-32001
http://www.rubyist.net/~nobu/t/20031028.html
http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/31645

                                                       matz.

-- Eric Will

···

On Sun, Oct 5, 2008 at 1:15 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Hmm, let's step forward one by one. You have two timers, which one is
causing the problem? What if wrapping timer callbacks with begin and
rescue?

I've tried wrapping the start method in timer.rb, wrapping the Timer
callbacks outside of "Timer.new" and inside of "Timer.new" (ie, part
of the callback), and none of these rescue it. When I wrap the call to
ioloop (which is the ENTIRE program, basically) it catches it... which
doesn't really narrow it down much.

I'm trying as I'm writing this, so bare with me for a moment...
wrapping c.read in lib/xmppd.rb catches it... wrapping parse in
Stream#read with it does NOT catch it. So it's somewhere between
Stream#read's call to Stream#parse.

                                                       matz.

I'll keep playing with this, but I don't know that I'm getting very
far... I still think it has something to do with threads.

-- Eric Will

···

On Sun, Oct 5, 2008 at 4:44 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

Yeah, that's as far as I can get. Wrapping different parts of
Stream#read doesn't catch it.

This seems impossible unless it's something to do with threads, which
I know next to nothing about. :slight_smile:

I was going to try with ruby1.9, but the IDN gem doesn't work.

-- Eric Will

···

On Sun, Oct 5, 2008 at 5:08 PM, Eric Will <rakaur@malkier.net> wrote:

I'll keep playing with this, but I don't know that I'm getting very
far... I still think it has something to do with threads.

Hi,

Show me the URL if you need further info, since I can read some
Japanese. :wink:

I'm not sure if these are relevant, since Google now shows lots of
this thread. :slight_smile:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-dev/7000

It's about the already fixed bugs.

http://133.44.98.95/cgi-bin/vframe.rb/ruby/ruby-list/31677?31634-32001

It's related to dRuby.

http://www.rubyist.net/~nobu/t/20031028.html

Exceptions "closed steam" and "stream closed" are different. A good
hint, I believe.

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-list/31645

It's related to dRuby.

According to the third URL, "stream closed" error was raised when fd
(IO object) is closed while a thread is waiting for that fd,
i.e. you've closed a socket which other thread (timer callback) is
blocking on.

              matz.

···

In message "Re: stream closed (IOError)" on Mon, 6 Oct 2008 04:44:37 +0900, "Eric Will" <rakaur@malkier.net> writes:

On Sun, Oct 5, 2008 at 1:15 PM, Yukihiro Matsumoto <matz@ruby-lang.org> wrote:

According to the third URL, "stream closed" error was raised when fd
(IO object) is closed while a thread is waiting for that fd,
i.e. you've closed a socket which other thread (timer callback) is
blocking on.

I think it was happening because the timer would kill off the socket, and then select() would be trying to check it in the main thread (no clue what was up with the error reporting, thread confusion!). I worked around this by just moving the timeout check from a Timer to manually doing it in the ioloop, before the dead connections are removed. Now it never gets to select(), and it works just fine. Thanks for your personal attention in helping me figure this out, matz. :slight_smile:

              matz.

-- Eric Will

···

On Oct 5, 2008, at 9:21 PM, Yukihiro Matsumoto wrote: