My OpenSSL server crashes

My company runs an OpenSSL server that we upload data to for storage in
a database. It also handles some other client-server communication.

Somehow, the OpenSSL server sometimes "crashes" its server socket.
According to the logs (and my own tests), new connections can no longer
be made after that.

I have to restart the app manually to get it running again.
I've put the relevant code + logging in a pastie:
http://pastie.org/1597282

I hope someone can point me in the right direction here or has some
information about how to further examine this, because I cannot explain
the dying of the ssl socket and I am at a loss on how to proceed...

--
Posted via http://www.ruby-forum.com/.

I don't really have a solution to the problem, although you could get
rid of SSLServer and just do it directly:

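require 'socket'
require 'openssl'

# server_socket is a plain TCPServer, context an OpenSSL::SSL::SSLContext
while conn = server_socket.accept
  Thread.new(conn) do |c|
    s = OpenSSL::SSL::SSLSocket.new(c, context)
    s.sync_close = true  # closing s also closes the underlying TCP socket
    s.accept             # SSL handshake, done inside the thread
    # ... etc
  end
end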

However, that's pretty much all that SSLServer#accept does anyway, so I'm
not sure it will solve anything. It might give a clearer picture of
where the error is, though (e.g. a corrupt SSL context? You could try
making a fresh one for each iteration).
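
To make it easier to see where the error is, the handshake can be wrapped so that a failure is logged per connection instead of propagating. This is only a rough sketch, not a fix for the original problem: `ssl_handshake` and `serve` are hypothetical helper names, and the set of exceptions rescued is an assumption about what a failed handshake may raise.

```ruby
require 'socket'
require 'openssl'

# Hypothetical helper: runs the server-side SSL handshake for one accepted
# connection. Returns the SSLSocket on success, or nil after logging the
# failure, so a single bad client does not raise out of the caller.
def ssl_handshake(conn, context, log = $stderr)
  s = OpenSSL::SSL::SSLSocket.new(conn, context)
  s.sync_close = true   # closing s also closes the underlying socket
  s.accept              # server-side handshake
  s
rescue OpenSSL::SSL::SSLError, SystemCallError, EOFError => e
  log.puts "handshake failed: #{e.class}: #{e.message}"
  conn.close unless conn.closed?
  nil
end

# Hypothetical accept loop: one thread per connection, the given block
# receives each successfully negotiated SSLSocket.
def serve(server_socket, context)
  while conn = server_socket.accept
    Thread.new(conn) do |c|
      s = ssl_handshake(c, context)
      next unless s
      begin
        yield s
      ensure
        s.close unless s.closed?
      end
    end
  end
end
```

Usage would look something like `serve(server_socket, context) { |s| ... }`, with the existing per-connection code in the block. The log lines should then show whether the failures happen during the handshake or later.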

Hi candlerb,

Thanks for your suggestions. I will definitely try it out because it is
indeed hard to see where the error happens...

I've also made a tcpdump capture to see what's going on, but I need to
fine-tune that a bit because a lot of weird things are happening there. I
will add some more debugging to help me pin down what exactly is
happening and I'll get back to you on that.

BTW: I noticed I'm using ruby version 1.9.1p378. I'm now in the process
of getting rvm to work and upgrading to the latest 1.9.2 version. Who
knows if that may solve something :)

Thanks for your suggestions. After some more debugging I found out the
error always occurs at a specific time during a specific function...

It turns out that a colleague changed things so that a scheduled task
made 400 servers start uploading data at once, with the result that my
application died.
I still don't understand why the socket crashes, but I do know what
triggers it. 400 ssl connects through a scheduled task from different
servers at exactly the same time is not a very good thing to do ;)

I still need to debug this, but at least I can do it when I have the
time for it... Thanks!

Tom van Leeuwen wrote in post #983721:

> BTW: I noticed I'm using ruby version 1.9.1p378. I'm now in the process
> of getting rvm to work and upgrading to the latest 1.9.2 version. Who
> knows if that may solve something :)

If that doesn't work, then try 1.8.7 too and see if the problem goes
away. There are still bugs turning up in 1.9.x.

Tom van Leeuwen wrote in post #983860:

> 400 ssl connects through a scheduled task from different
> servers at exactly the same time is not a very good thing to do ;)

Indeed.

Ruby 1.8 used to have a hard-coded limit of 1024 open files (due to its
use of FD_SET). Maybe this is not a problem in 1.9, but you could also
check whether there's a ulimit restriction.
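
A quick way to check the limit from inside the Ruby process itself might look like this. It is a small sketch: `open_fd_count` is a hypothetical helper and assumes a Linux-style /proc filesystem, so it returns nil on other systems.

```ruby
# Per-process limit on open file descriptors (what `ulimit -n` reports).
soft, hard = Process.getrlimit(Process::RLIMIT_NOFILE)
puts "open files: soft limit #{soft}, hard limit #{hard}"

# Hypothetical helper: approximate number of descriptors currently open in
# this process. Assumes /proc/self/fd exists (Linux); nil otherwise.
def open_fd_count
  Dir.children('/proc/self/fd').size if Dir.exist?('/proc/self/fd')
end

puts "currently open: #{open_fd_count.inspect}"
```

Logging `open_fd_count` around the time of the scheduled upload would show whether the process gets close to the soft limit when the 400 connections arrive.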
