Problem with timeout for DBI on Solaris

Hi all,

Ruby 1.6.8
Solaris 8/9
DBI 0.18

As a test, we took down one of our test database machines and tried to
connect to it. The problem is that, on Solaris boxes, it seems to
ignore the ‘timeout’ that we’ve wrapped the connection in. Here’s the
code:

Begin code

require "dbi"
require "timeout"

# dsn, user and passwd are assumed to be defined elsewhere
puts Time.now.to_s
dbh = nil
begin
  begin
    timeout(3) {
      dbh = DBI.connect(dsn, user, passwd)
    }
  rescue TimeoutError => e
    puts "DBI timed out: #{e}"
    exit
  ensure
    puts Time.now.to_s
  end
rescue DBI::Error => e
  puts "Oops: #{e}"
  exit
rescue
  # interpolate: String + Exception would raise a TypeError
  puts "WTF? #{$!}"
  exit
end

sth = dbh.prepare("select sysdate from dual") # oracle - replace as needed
sth.execute

puts "Result: " + sth.fetch.to_s

sth.finish
dbh.disconnect

End code

On a Linux box, this code works as expected. However, on both Solaris 8
and Solaris 9, it takes about 3 minutes to time out. On the Solaris 9
box, I have the host hard-coded in the /etc/hosts file, while the
Solaris 8 box uses DNS.

Checking netstat -a, it appears that the connection is just sitting in a
SYN_SENT state.

We also tried just shutting down the listener and the DB itself (while
leaving the host up) and that worked fine.

Any ideas?

Regards,

Dan

As a test, we took down one of our test database machines and tried to
connect to it. The problem is that, on Solaris boxes, it seems to
ignore the ‘timeout’ that we’ve wrapped the connection in. Here’s the
code:

timeout(3){
dbh = DBI.connect(dsn,user,passwd)
}

I think that the problem is that once DBI.connect has committed to calling
the Oracle C library, it won’t come back until it has finished one way or
the other. In other words, it doesn’t play nicely with Ruby’s cooperative
threading. If a SIGALRM is raised after 3 seconds, Ruby just takes a note of
it and lets the library call complete before taking action.
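
For comparison, here is a minimal sketch of the behaviour the original script was relying on, assuming a current Ruby where `Timeout.timeout` and `Timeout::Error` replace the older `timeout` and `TimeoutError` spellings. The helper name `run_with_deadline` is made up for illustration. It only shows that a block waiting at the Ruby level (here, `sleep`) is interrupted on time; it says nothing about a call blocked inside a C extension, which is the case described above.

```ruby
require "timeout"

# Hypothetical helper: runs the block under a deadline and reports
# whether it finished or was interrupted.
def run_with_deadline(seconds)
  Timeout.timeout(seconds) { yield }
  :finished
rescue Timeout::Error
  :timed_out
end

# A Ruby-level wait is interrupted once the deadline passes.
puts run_with_deadline(0.2) { sleep 2 }
# A block that returns quickly is unaffected.
puts run_with_deadline(2) { 1 + 1 }
```

Whether a blocking call inside an extension library can be interrupted the same way depends on how that extension is written, so this is a baseline to compare against rather than a fix.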

For Oracle, you can try using the ruby-oci8 library from RAA which supports
non-blocking queries (although I don’t know if it supports non-blocking
connect). It comes with its own DBD which you have to install manually into
/usr/local/lib/ruby/site_ruby/1.6/DBD/OCI8/OCI8.rb

Then modify it as follows:

require 'dbi'
require 'DBD/OCI8/OCI8'

module DBI
  module DBD
    module OCI8
      class Driver
        def connect( dbname, user, auth, attr )
          handle = ::OCI8.new(user, auth, dbname, attr['Privilege'])
          handle.non_blocking = true if attr['NonBlocking'] #<<<< add this line
          return Database.new(handle, attr)
        rescue OCIError => err
          raise DBI::DatabaseError.new(err.message, err.code)
        end
      end
    end
  end
end

(i.e. either just run the above code at the start of your program, or add
the one extra line into the corresponding place in DBD/OCI8/OCI8.rb)

Then connect to your database as:

dbh = DBI.connect('dbi:OCI8:',user,password, {'NonBlocking'=>true} )

This at least allows a multi-threaded application to run multiple concurrent
queries on the same database, without a slow query locking out all other
Ruby threads; as I said, I’ve not tested it with connect.

Having said all that, I can’t explain why Oracle under Linux doesn’t have
this problem.

Checking netstat -a, it appears that the connection is just sitting in a
SYN_SENT state.

Normally a TCP socket has a 75-second connect timeout. So I also don’t know
why you’re getting three minutes; perhaps something is trying twice, or has
altered the default TCP timers.
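
Since netstat shows the attempt stuck in SYN_SENT, one possible workaround is to probe the listener port with a plain TCP connect that carries its own timeout before calling DBI.connect. A minimal sketch, assuming a current Ruby where `Socket.tcp` accepts `connect_timeout`; `listener_reachable?` is a hypothetical helper, and the host name and port 1521 (the usual Oracle listener default) in the usage comment are placeholders.

```ruby
require "socket"

# Hypothetical helper (not part of DBI or ruby-oci8): true if a TCP
# connection to host:port can be opened within `seconds`, false otherwise.
def listener_reachable?(host, port, seconds = 3)
  # Socket.tcp closes the socket when the block returns.
  Socket.tcp(host, port, connect_timeout: seconds) { true }
rescue SystemCallError, SocketError, IOError
  # Refused, unreachable, timed out, or name lookup failed.
  false
end

# Usage (placeholder host and port):
#   abort "listener unreachable" unless listener_reachable?("db.example.com", 1521)
#   dbh = DBI.connect(dsn, user, passwd)
```

The probe only narrows the window: the host can still go away between the check and the real connect, and the timeout here covers the TCP connect, not the database login that follows.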

Regards,

Brian.

On Wed, Apr 09, 2003 at 01:10:46AM +0900, Daniel Berger wrote:

Brian Candler wrote:

As a test, we took down one of our test database machines and tried to
connect to it. The problem is that, on Solaris boxes, it seems to
ignore the ‘timeout’ that we’ve wrapped the connection in. Here’s the
code:

timeout(3){
dbh = DBI.connect(dsn,user,passwd)
}

I think that the problem is that once DBI.connect has committed to calling
the Oracle C library, it won’t come back until it has finished one way or
the other. In other words, it doesn’t play nicely with Ruby’s cooperative
threading. If a SIGALRM is raised after 3 seconds, Ruby just takes a note of
it and lets the library call complete before taking action.

For Oracle, you can try using the ruby-oci8 library from RAA which supports
non-blocking queries (although I don’t know if it supports non-blocking
connect). It comes with its own DBD which you have to install manually into
/usr/local/lib/ruby/site_ruby/1.6/DBD/OCI8/OCI8.rb

I’ll give it a shot. FYI, I tried a similar script with Perl:

Begin code

use strict;
use DBI;

# $db, $login and $passwd are assumed to be set elsewhere
my $dbh;

eval{
    alarm(5);
    $dbh = DBI->connect("dbi:Oracle:$db",$login,$passwd,{
        RaiseError => 1, PrintError => 1});
    alarm(0);
};

if($@){
    print "Error: $@\n";
    exit;
}

End code

This was the result:

“Error: DBI->connect(db) failed: ORA-12560: TNS:protocol adapter error
(DBD ERROR: OCIServerAttach)”

So, it looks like it was a driver issue (?). Perl’s driver seems to be
non-blocking by default, which I think would be nice.

Thanks for the info.

Regards,

Dan

On Wed, Apr 09, 2003 at 01:10:46AM +0900, Daniel Berger wrote:

Brian Candler wrote:

> For Oracle, you can try using the ruby-oci8 library from RAA which supports
> non-blocking queries (although I don't know if it supports non-blocking
> connect). It comes with its own DBD which you have to install manually into
> /usr/local/lib/ruby/site_ruby/1.6/DBD/OCI8/OCI8.rb
>
> Then modify it as follows:
>
> require 'dbi'
> require 'DBD/OCI8/OCI8'
>
> module DBI
>   module DBD
>     module OCI8
>       class Driver
>         def connect( dbname, user, auth, attr )
>           handle = ::OCI8.new(user, auth, dbname, attr['Privilege'])
>           handle.non_blocking = true if attr['NonBlocking'] #<<<< add this line
>           return Database.new(handle, attr)
>         rescue OCIError => err
>           raise DBI::DatabaseError.new(err.message, err.code)
>         end
>       end
>     end
>   end
> end

I installed the oci8 package and made the changes you suggested, but no
luck. In fact, it appears to hang in the ::OCI8.new call. A simple “puts”
statement placed after that call is never reached on my system.

Stepping through the debugger, I can see that it’s hanging on the
do_ocicall() method in the oci8.rb file. It steps in the first time,
yields back out, tries to step back in, and that’s where it hangs.

snippet of debugger session:

/usr/local/lib/ruby/site_ruby/1.6/dbi/dbi.rb:550: db = @handle.connect(db_args, user, auth, new_params)
(rdb:1) s
/usr/local/lib/ruby/site_ruby/1.6/DBD/OCI8/OCI8.rb:25: handle = ::OCI8.new(user, auth, dbname, attr['Privilege'])
(rdb:1) s
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:64: if @@env.nil?
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:65: if OCIEnv.respond_to?("create")
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:66: @@env = OCIEnv.create()
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:73: case privilege
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:74: when nil
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:75: privilege = OCI_DEFAULT
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:84: @svc = @@env.alloc(OCISvcCtx)
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:85: @srv = @@env.alloc(OCIServer)
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:86: @auth = @@env.alloc(OCISession)
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:87: @ctx = [0, Mutex.new, nil]
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:89: @auth.attrSet(OCI_ATTR_USERNAME, uid)
(rdb:1) @ctx
[0, #<Mutex:0x3cdf78 @locked=false, @waiting=[]>, nil]
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:90: @auth.attrSet(OCI_ATTR_PASSWORD, pswd)
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:91: do_ocicall(@ctx) { @srv.attach(conn) }
(rdb:1) @srv
#<OCIServer:0x3ce788>
(rdb:1) s
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:40: sleep_time = 0.01
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:41: ctx[CTX_MUTEX].lock
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:42: ctx[CTX_THREAD] = Thread.current
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:58: end
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:44: yield
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:91: do_ocicall(@ctx) { @srv.attach(conn) }
(rdb:1) s

That’s where it hangs - trying to step into the do_ocicall() the second
time.

Any ideas? Some sort of Mutex issue perhaps?

Regards,

Dan

a = [74, 117, 115, 116, 32, 65, 110, 111, 116, 104, 101, 114, 32, 82]
a.push(117,98, 121, 32, 72, 97, 99, 107, 101, 114)
puts a.pack("C*")

Hi,

Brian Candler <B.Candler@pobox.com> writes:

For Oracle, you can try using the ruby-oci8 library from RAA which supports
non-blocking queries (although I don’t know if it supports non-blocking
connect). It comes with its own DBD which you have to install manually into
/usr/local/lib/ruby/site_ruby/1.6/DBD/OCI8/OCI8.rb

Then modify it as follows:

require 'dbi'
require 'DBD/OCI8/OCI8'

module DBI
  module DBD
    module OCI8
      class Driver
        def connect( dbname, user, auth, attr )
          handle = ::OCI8.new(user, auth, dbname, attr['Privilege'])
          handle.non_blocking = true if attr['NonBlocking'] #<<<< add this line

This doesn’t work for non-blocking connect because it switches the
handle from blocking to non-blocking mode only after the connection has
been established.

Please try the following patches:

--- oci8.rb~ 2003-03-08 19:11:51.000000000 +0900
+++ oci8.rb 2003-04-12 15:20:10.000000000 +0900
@@ -60,7 +60,7 @@
   end
   include Util
 
-  def initialize(uid, pswd, conn = nil, privilege = nil)
+  def initialize(uid, pswd, conn = nil, privilege = nil, non_blocking = false)
     if @@env.nil?
       if OCIEnv.respond_to?("create")
         @@env = OCIEnv.create()
@@ -86,6 +86,7 @@
     @auth = @@env.alloc(OCISession)
     @ctx = [0, Mutex.new, nil]
 
+    @svc.attrSet(OCI_ATTR_NONBLOCKING_MODE, nil) if non_blocking
     @auth.attrSet(OCI_ATTR_USERNAME, uid)
     @auth.attrSet(OCI_ATTR_PASSWORD, pswd)
     do_ocicall(@ctx) { @srv.attach(conn) }

--- DBD/OCI8/OCI8.rb~ 2002-09-12 23:14:37.000000000 +0900
+++ DBD/OCI8/OCI8.rb 2003-04-12 15:21:43.000000000 +0900
@@ -22,7 +22,7 @@
     end
 
     def connect( dbname, user, auth, attr )
-      handle = ::OCI8.new(user, auth, dbname, attr['Privilege'])
+      handle = ::OCI8.new(user, auth, dbname, attr['Privilege'], attr['NonBlocking'])
       return Database.new(handle, attr)
     rescue OCIError => err
       raise DBI::DatabaseError.new(err.message, err.code)

I’ve not tested these patches and I don’t know whether they work for
non-blocking connect.
There is no running Oracle server on my Linux box. ;-(

Cheers.

KUBO Takehiro

eval{
    alarm(5);
    $dbh = DBI->connect("dbi:Oracle:$db",$login,$passwd,{
        RaiseError => 1, PrintError => 1});
    alarm(0);
};

This was the result:

“Error: DBI->connect(db) failed: ORA-12560: TNS:protocol adapter error
(DBD ERROR: OCIServerAttach)”

After 5 seconds?

It could also be a signal-handling issue. Ruby installs its own signal
handlers for a bunch of common signals, and installs them with SA_RESTART
set so that system calls don’t terminate prematurely with EINTR if a signal
comes along. If you change a signal handler with trap() {…}, Ruby just
updates its internal data structures; it doesn’t touch the handlers it
installed.

Maybe perl just installs signal handlers in the O/S when required, which may
mean in this case EINTR is sufficient to terminate the connection attempt.

I had the same issue when looking at fcgi and Apache sending a SIGUSR1 to a
fastcgi process to do a ‘clean shutdown’.

Regards,

Brian.

On Wed, Apr 09, 2003 at 01:48:33AM +0900, Daniel Berger wrote:

Daniel Berger wrote:

I installed the oci8 package and made the changes you suggested, but no
luck. In fact, it appears to hang in the ::OCI8.new call. A simple “puts”
statement placed after that call is never reached on my system.

Stepping through the debugger, I can see that it’s hanging on the
do_ocicall()

More specifically:

oci8.rb, line 91: do_ocicall(@ctx) { @srv.attach(conn) }

Apparently, it doesn’t like the code inside the block. The variable
‘@srv’ is an OCIServer.

Looking over at the server.c file in the ruby-oci8 source directory,
it appears to hang when OCIServerAttach() is called. That’s where I’m
stuck.

Regards,

Dan

No, I don’t know, but it works for me for non-blocking queries (as opposed
to connects). You can see from the above how it’s intended to work: it
issues a call, waits 10ms, polls for an answer, waits 20ms, polls again,
and so on.
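
The polling pattern described above can be sketched as follows. This is not the ruby-oci8 code: `poll_with_backoff` is a hypothetical helper, the block stands in for one non-blocking attempt that returns nil while the call is still in progress, and doubling the sleep each round is an assumption based on the 10 ms, 20 ms sequence mentioned.

```ruby
# Hypothetical helper: repeats a non-blocking attempt until it yields a
# non-nil result or max_wait seconds have passed. Returns nil on timeout.
def poll_with_backoff(max_wait, first_sleep = 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + max_wait
  sleep_time = first_sleep
  loop do
    result = yield                       # one non-blocking attempt
    return result unless result.nil?     # finished: hand back the answer
    return nil if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline
    sleep(sleep_time)                    # other Ruby threads can run here
    sleep_time *= 2                      # 10 ms, 20 ms, 40 ms, ...
  end
end

# Usage with a stand-in operation that completes on the third poll.
attempts = 0
answer = poll_with_backoff(1.0) { (attempts += 1) >= 3 ? :attached : nil }
puts answer.inspect
```

The point of sleeping between polls is that the Ruby scheduler gets control back, which is what a single blocking library call never allows.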

Have you tried it with a database which is ‘up’ rather than one which is
unreachable? If the rest of the functionality is OK on your system, then
maybe it’s just non-blocking connects which have a problem.

The author of ruby-oci8 was extremely helpful when I found a bug, although
he says that he doesn’t actually use it himself any more and therefore
doesn’t have much time to work on it.

Regards,

Brian.

On Wed, Apr 09, 2003 at 04:42:21AM +0900, Daniel Berger wrote:

/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:91: do_ocicall(@ctx) { @srv.attach(conn) }
(rdb:1) @srv
#<OCIServer:0x3ce788>
(rdb:1) s
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:40: sleep_time = 0.01
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:41: ctx[CTX_MUTEX].lock
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:42: ctx[CTX_THREAD] = Thread.current
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:58: end
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:44: yield
(rdb:1) n
/usr/local/lib/ruby/site_ruby/1.6/oci8.rb:91: do_ocicall(@ctx) { @srv.attach(conn) }
(rdb:1) s

That’s where it hangs - trying to step into the do_ocicall() the second
time.

Any ideas? Some sort of Mutex issue perhaps?

KUBO Takehiro <kubo@jiubao.org> writes:

Please try the following patches:

--- oci8.rb~ 2003-03-08 19:11:51.000000000 +0900
+++ oci8.rb 2003-04-12 15:20:10.000000000 +0900
@@ -60,7 +60,7 @@
   end
   include Util
 
-  def initialize(uid, pswd, conn = nil, privilege = nil)
+  def initialize(uid, pswd, conn = nil, privilege = nil, non_blocking = false)
     if @@env.nil?
       if OCIEnv.respond_to?("create")
         @@env = OCIEnv.create()
@@ -86,6 +86,7 @@
     @auth = @@env.alloc(OCISession)
     @ctx = [0, Mutex.new, nil]
 
+    @svc.attrSet(OCI_ATTR_NONBLOCKING_MODE, nil) if non_blocking
     @auth.attrSet(OCI_ATTR_USERNAME, uid)
     @auth.attrSet(OCI_ATTR_PASSWORD, pswd)
     do_ocicall(@ctx) { @srv.attach(conn) }

Oops. Please change

+    @svc.attrSet(OCI_ATTR_NONBLOCKING_MODE, nil) if non_blocking

to

+    @srv.attrSet(OCI_ATTR_NONBLOCKING_MODE, nil) if non_blocking

Cheers.

KUBO Takehiro