DRB and threads

I wonder if anyone can give me some hints on the interactions between dRuby
(drb) and threads.

I have been having a bit of a play, and it seems if you set up a dRuby
server in the way described in the Pickaxe, then each incoming request is
handled as a separate thread, but method calls are made to the same
object concurrently.

Example:

server.rb

require 'drb'

class SlowServer
  def doit(x)
    puts "Starting #{x}..."
    sleep 3
    puts "Finished #{x}"
    nil
  end
end

a = SlowServer.new
DRb.start_service('druby://localhost:9000', a)
DRb.thread.join

client.rb

require 'drb'

DRb.start_service # not sure what this does for a client?
obj = DRbObject.new(nil, 'druby://localhost:9000')
10.times {
  obj.doit($$)
}

Now, if I run two copies of client.rb concurrently as two separate
processes, they both only take 30 seconds to complete. The ‘doit’ method is
being run twice on the same object instance.

If I understand this rightly, then I can’t see how you can use this with any
object which carries state unless you rewrite or wrap all the methods to
make them thread-safe. Is this correct? Unfortunately I can’t find much
documentation in English, and I can’t see if there’s another way of running
it - even simple serialisation, so that method calls are handled one at a
time.

Anyway, the pattern I really want to implement is a pool of objects, with
the method call handed to any free object (and that object does not get
another request until it has finished handling the first one).

The application is that I have a database front-end class where each instance
keeps a DBI handle open (amongst other things) but each method call is
independent of the previous one, i.e. there’s no state kept
between calls, other than inside the database itself of course. So
incoming requests can be farmed off to any object which is free.

Do I have to write a wrapper object which duplicates every method in my
database class (or method_missing)? Or is there a simpler way of handling
this?

Another solution would be if I could run a DRb server under FastCGI, so that
I had a pool of processes (maintained by Apache) and requests would be
farmed out to those processes. Is that possible?

Thanks,

Brian.

[P.S. I am using dRuby-2.0.2 with ruby-1.6.8]

Brian Candler wrote:

I wonder if anyone can give me some hints on the interactions between dRuby
(drb) and threads.

I have been having a bit of a play, and it seems if you set up a dRuby
server in the way described in the Pickaxe, then each incoming request is
handled as a separate thread, but method calls are made to the same
object concurrently.

[snip]

If I understand this rightly, then I can’t see how you can use this with any
object which carries state unless you rewrite or wrap all the methods to
make them thread-safe. Is this correct? Unfortunately I can’t find much
documentation in English, and I can’t see if there’s another way of running
it - even simple serialisation, so that method calls are handled one at a
time.

I highly recommend Pragmatic Programmers’ book chapter on threads and
synchronization:
http://www.rubycentral.com/book/tut_threads.html

The short answer is, no, druby does not by itself provide any
synchronization or mutual exclusion for multiple threads accessing the
same remote or local object.

Anyway, the pattern I really want to implement is a pool of objects, with
the method call handed to any free object (and that object does not get
another request until it has finished handling the first one).

The application is that I have a database front-end class where each instance
keeps a DBI handle open (amongst other things) but each method call is
independent of the previous one, i.e. there’s no state kept
between calls, other than inside the database itself of course. So
incoming requests can be farmed off to any object which is free.

Do I have to write a wrapper object which duplicates every method in my
database class (or method_missing)? Or is there a simpler way of handling
this?

Adding thread-safe synchronization primitives to your database frontend
is probably a good idea anyway, but yes, this is probably the best way
to go. Check out ‘DelegateClass’ in the standard ‘delegate.rb’ library
file if you want a quick way to wrap only those methods that require
synchronization, while passing all the other calls to the original object.
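As a rough illustration of the DelegateClass suggestion, here is a minimal sketch. The Counter class and its increment method are made-up stand-ins for the database front-end, not anything from the original post; only the method that needs protecting is wrapped in a Mutex, and everything else falls through to the wrapped object.

```ruby
require "delegate"

# Hypothetical stand-in for the database front-end class.
class Counter
  attr_reader :count

  def initialize
    @count = 0
  end

  # Deliberately not thread-safe: read, pause, then write.
  def increment
    current = @count
    sleep 0.001 # widen the race window
    @count = current + 1
  end
end

# Only #increment is overridden; other calls (e.g. #count) are
# delegated unchanged to the wrapped Counter.
class SynchronizedCounter < DelegateClass(Counter)
  def initialize(target)
    super(target)
    @lock = Mutex.new
  end

  def increment
    @lock.synchronize { super }
  end
end

safe = SynchronizedCounter.new(Counter.new)
threads = Array.new(10) { Thread.new { 5.times { safe.increment } } }
threads.each(&:join)
puts safe.count
```

The wrapper, rather than the bare object, would then be the thing handed to DRb.start_service. Note that this serialises calls to the wrapped methods, so it gives safety but no concurrency.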

Personally, I’m somewhat surprised that there isn’t an “object pool” or
“worker thread pool” pattern implementation included in the standard
pattern libraries…it seems like it wouldn’t be too complex, and a
standard thread pool design could speed up a lot of prototyping and
tutorial tasks, even if it needed to be tuned for specific applications.

Hmm…I guess I just volunteered to write one, since I’m not finding
anything on RAA… ;-)

Another solution would be if I could run a DRb server under FastCGI, so that
I had a pool of processes (maintained by Apache) and requests would be
farmed out to those processes. Is that possible?

You could also maintain this pool the same way that you do a set of
threads, and have each process implement its own druby server object
which would be load-balanced by a single front-end. That approach would
have the benefit of protecting you from any blocking calls that
otherwise wouldn’t be effectively avoided by the Ruby interpreter’s
non-native threads. Unfortunately, it also adds an additional layer of
marshalling/unmarshalling overhead.

Thanks,

Brian.

[P.S. I am using dRuby-2.0.2 with ruby-1.6.8]

Good luck,

Lennon Day-Reynolds
lennon@day-reynolds.com

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030306234411.GA43350@uk.tiscali.com

If I understand this rightly, then I can’t see how you can use this with
any
object which carries state unless you rewrite or wrap all the methods to
make them thread-safe. Is this correct?

Yes, that’s correct. Since the server can hand the instance’s reference to
multiple clients (or a single multithreaded client) you must take measures
to handle this.

Anyway, the pattern I really want to implement is a pool of objects, with
the method call handed to any free object (and that object does not get
another request until it has finished handling the first one).

The application is that I have a database front-end class where each
instance
keeps a DBI handle open (amongst other things) but each method call is
independent of the previous one, i.e. there’s no state kept
between calls, other than inside the database itself of course. So
incoming requests can be farmed off to any object which is free.

Then I suggest using a variation of the facade pattern: You have a single
instance that is known to clients. This instance internally holds a pool
of instances that do the real work. Methods of your facade fetch an
instance from the pool, hand the request over to it, return the results
and put the instance back into the pool. Pool access must be synchronized
of course.
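A minimal sketch of such a facade follows, assuming a fixed set of workers created up front. Worker, PoolFacade and lookup are hypothetical names used only for illustration. A Queue holds the idle workers; since Queue is itself thread-safe, it takes care of the synchronized pool access, and method_missing saves duplicating every method of the real class.

```ruby
require "thread" # needed on old Rubies; harmless on current ones

# Hypothetical worker standing in for the database front-end.
class Worker
  def initialize(id)
    @id = id
  end

  def lookup(key)
    sleep 0.01 # pretend to query the database
    "worker #{@id}: #{key}"
  end
end

# The single object that clients would see.
class PoolFacade
  def initialize(workers)
    @sample = workers.first # used only to answer respond_to?
    @idle = Queue.new
    workers.each { |w| @idle << w }
  end

  def method_missing(name, *args, &block)
    return super unless @sample.respond_to?(name)
    worker = @idle.pop # blocks until some worker is free
    begin
      worker.public_send(name, *args, &block)
    ensure
      @idle << worker # always hand the worker back, even on error
    end
  end

  def respond_to_missing?(name, include_private = false)
    @sample.respond_to?(name) || super
  end
end

facade = PoolFacade.new([Worker.new(1), Worker.new(2)])
results = Array.new(6) { |i| Thread.new { facade.lookup("key#{i}") } }.map(&:value)
puts results
```

Each worker handles one call at a time, and a caller simply waits when every worker is busy. This sketch does not grow the pool on demand or replace a worker whose connection has died.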

Do I have to write a wrapper object which duplicates every method in my
database class (or method_missing)? Or is there a simpler way of handling
this?

method_missing is not a bad idea, I guess.

Another solution would be if I could run a DRb server under FastCGI, so
that
I had a pool of processes (maintained by Apache) and requests would be
farmed out to those processes. Is that possible?

Possibly. Although multiple threads should be more efficient than multiple
processes.

robert

this might give you some ideas - it’s only for postgresql though.

it does implement the pattern you speak of.

http://raa.ruby-lang.org/list.rhtml?name=pgconngroup

-a

On Fri, 7 Mar 2003, Brian Candler wrote:

Anyway, the pattern I really want to implement is a pool of objects, with
the method call handed to any free object (and that object does not get
another request until it has finished handling the first one).

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

If I understand this rightly, then I can’t see how you can use this with
any
object which carries state unless you rewrite or wrap all the methods to
make them thread-safe. Is this correct?

Yes, that’s correct. Since the server can hand the instance’s reference to
multiple clients (or a single multithreaded client) you must take measures
to handle this.

OK, that’s good to know; it could probably do with making clear in the
documentation, since Bad Things will likely happen if you take an arbitrary
object (like a DBI::DatabaseHandle) and share it via DRb!

I did find references to a pool in the source (Class DRbConn), however it
looks like it’s for a client to have a pool of connections to multiple
remote URIs, rather than a server having a pool of objects to handle
requests.

Then I suggest using a variation of the facade pattern: You have a single
instance that is known to clients. This instance internally holds a pool
of instances that do the real work. Methods of your facade fetch an
instance from the pool, hand the request over to it, return the results
and put the instance back into the pool. Pool access must be synchronized
of course.

I was thinking along the same lines. I guess it shouldn’t be too hard to
code, but it’s unusual in Ruby to find something generic like this not
already implemented :-)

Another solution would be if I could run a DRb server under FastCGI, so
that
I had a pool of processes (maintained by Apache) and requests would be
farmed out to those processes. Is that possible?

Possibly. Although multiple threads should be more efficient than multiple
processes.

I’d actually like to run as multiple processes, for two specific reasons:

(1) being able to avoid any blocking issues with DBI drivers. There will
be some long-duration queries which take place; I don’t want all
other clients to block while this happens!
(2) making use of multiple CPUs

I quite like the Apache/FastCGI approach because the code for allocating
instances of the backend is already written and hopefully reasonably
debugged; it can have a fixed or variable number of instances.

I guess I could write a front-end in Ruby and use IO.popen(‘-’) to spawn
children connected via a pipe. But then I’d want a version of DRb which
sends messages over an IO object (and can respawn them if they die)
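For what it is worth, the IO.popen('-') idea can be sketched without DRb by sending Marshal-encoded requests over the pipe. This is only an illustration under assumptions: Backend, spawn_worker and call_worker are hypothetical names, it needs a platform with fork, and it does not respawn a child that dies.

```ruby
# Hypothetical worker class; in the real application this would hold
# the DBI handle.
class Backend
  def square(n)
    n * n
  end
end

# Fork a child connected to the parent by a bidirectional pipe.
# In the child, IO.popen("-") returns nil and STDIN/STDOUT are the pipe.
def spawn_worker
  pipe = IO.popen("-", "r+")
  return pipe if pipe # parent: hand back our end of the pipe

  backend = Backend.new
  STDOUT.sync = true
  begin
    loop do
      name, args = Marshal.load(STDIN) # one request per message
      Marshal.dump(backend.public_send(name, *args), STDOUT)
    end
  rescue EOFError
    # parent closed its end of the pipe: fall through and exit
  end
  exit!(0)
end

# Parent side: send one request and wait for the reply.
def call_worker(pipe, name, *args)
  Marshal.dump([name, args], pipe)
  pipe.flush
  Marshal.load(pipe)
end

pipe = spawn_worker
puts call_worker(pipe, :square, 7)
pipe.close # child sees EOF and exits
```

Because the child calls whatever method name arrives on the pipe, this is only reasonable when both ends are trusted. A front-end would keep several such pipes and hand each request to an idle one.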

Regards,

Brian.

On Fri, Mar 07, 2003 at 07:15:29PM +0900, Robert Klemme wrote:

Or, I haven’t tried mod_ruby, but I guess the same would apply. Each Apache
worker process will have one instance of the application? And I don’t have
to worry about one object receiving multiple invocations from threads?

Regards,

Brian.

On Fri, Mar 07, 2003 at 01:12:03PM +0000, Brian Candler wrote:

I’d actually like to run as multiple processes, for two specific reasons:

(1) being able to avoid any blocking issues with DBI drivers. There will
be some long-duration queries which take place; I don’t want all
other clients to block while this happens!
(2) making use of multiple CPUs

I quite like the Apache/FastCGI approach because the code for allocating
instances of the backend is already written and hopefully reasonably
debugged; it can have a fixed or variable number of instances.

“Brian Candler” B.Candler@pobox.com wrote in message
news:20030307131203.B99950@linnet.org

Yes, that’s correct. Since the server can hand the instance’s reference
to
multiple clients (or a single multithreaded client) you must take
measures
to handle this.

OK, that’s good to know; it could probably do with making clear in the
documentation, since Bad Things will likely happen if you take an
arbitrary
object (like a DBI::DatabaseHandle) and share it via DRb!

Yes.

I did find references to a pool in the source (Class DRbConn), however it
looks like it’s for a client to have a pool of connections to multiple
remote URIs, rather than a server having a pool of objects to handle
requests.

Like:

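require 'thread'

# A minimal thread-safe pool of reusable objects.
class Pool
  def initialize(type, *args)
    @mutex = Mutex.new
    @type = type
    @args = args
    @pool = []
  end

  # Take an idle element, or build a new one if none is free.
  def get
    @mutex.synchronize do
      @pool.shift or createNew()
    end
  end

  # Hand an element back so it can be reused.
  def put(elem)
    @mutex.synchronize do
      @pool.push elem
    end
  end

  def createNew
    @type.new(*@args)
  end
end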

I was thinking along the same lines. I guess it shouldn’t be too hard to
code, but it’s unusual in Ruby to find something generic like this not
already implemented :-)

Well, there’s always room to contribute something new, I guess.

regards

robert

debugged, but not user friendly. i wrote a little wrapper that dealt with
much of the signal handling, and used someone else's pattern (i forget whose) to
bootstrap a normal cgi object off of an fcgi object. it’s at

http://groups.google.com/groups?q=ahoward+mod_fcgi+group:comp.lang.ruby&hl=en&lr=&ie=UTF-8&selm=Pine.LNX.4.33.0302121430040.10747-100000%40eli.fsl.noaa.gov&rnum=1

-a

On Fri, 7 Mar 2003, Brian Candler wrote:

I quite like the Apache/FastCGI approach because the code for allocating
instances of the backend is already written and hopefully reasonably
debugged; it can have a fixed or variable number of instances.

Yes, I found this sufficiently interesting that I kept a copy when you first
posted it, so thanks for reminding me! It's [ruby-talk:64468] for a shorter
link.

Did you find that installing a SIGPIPE handler was actually necessary?
According to the mod_fastcgi documentation, mod_fastcgi itself sets this for
applications which it spawns; also libfcgi installs an empty signal handler
for SIGPIPE as well (OS_SigpipeHandler in libfcgi/os_unix.c)

I removed install_traps and sent a kill -PIPE to the spawned process and it
didn't seem to blink, so I think it's not actually necessary (but perhaps it
is with the pure Ruby version of fcgi)

A few other points:

* you call '@@server.close' but this ties you to the pure-Ruby
implementation; I am using the C version and it doesn't set such a class
variable. This means you can get warnings logged such as

mod_fcgi.rb:in `install_traps': uninitialized class variable @@server in MOD_FCGI (NameError)

* the trapping of TERM and HUP doesn't work properly for me. What happens is
that if I send such a signal to the process, nothing happens (ps shows the
same pid) until the next HTTP request comes along, at which point it fails
and Apache returns '500 Internal Server Error'. The process is then
restarted and it's fine thereafter.

I put some file debugging in:

  $stderr = File.open("/tmp/errs","a")
  $stderr.sync = true
  ...

  trap('SIGHUP') do
    $stderr.puts "#{Time.now} signals #{@@signals.inspect} handling_request #{@@handling_request.inspect} exit_requested #{@@exit_requested.inspect}"
  ...

What seems to happen is that the trap handler is not even started until the
subsequent request comes in, as shown by the timestamp. The process *then*
commits suicide (since @@handling_request is still false at that point) and
the 500 error occurs. Changing 'exit' to 'exit!' doesn't make any difference
either.

It's as if the signal is held up in accept() until the next incoming
connection arrives. I can't work out why this is the case, but for now just
removing 'install_traps' actually gives me much better results, since the
process just dies and is respawned straight away.

FYI I am running ruby 1.6.8 (2002-12-24) [i386-freebsd4.7] with
ruby-fcgi-0.8.1 + fcgi-2.4.0, apache-1.3.27, mod_fastcgi-2.2.12

* I also had to make a few changes to make it load cleanly under ruby -w
(attached)

Otherwise this all looks very cool, and I actually don't think that it's
Apache-specific. SIGPIPE isn't an Apache extension to fastcgi spec, it just
closes the socket if it isn't interested in waiting for the response, and
the OS generates SIGPIPE. Other fastcgi servers are likely to do the same.

As a result, I think that what you've written really belongs in the core
FCGI library anyway, i.e.

  - bootstrapping of a CGI object (maybe FCGI.each_cgi ?)

and perhaps also optional graceful shutdown on USR1, if it can be made to
work (although that _is_ an Apache feature)

Or else at least it can go on the RubyGardenWiki ?

Regards,

Brian.

P.S. Thinking about USR1, I just checked and libfcgi does install a USR1
handler (which sets an internal flag for a graceful abort). However it
doesn't work very well if the process is between requests, because it just
sits in accept() and catches the signal, so doesn't abort until the next
incoming connection occurs, giving a 500 error to the client.
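One common way around a signal that is not acted on until the next connection is to stop blocking indefinitely in accept: wait in IO.select with a timeout and re-check a shutdown flag each time round. The sketch below is plain Ruby with a TCP socket, not libfcgi or the mod_fcgi.rb wrapper, and GracefulServer is a hypothetical name. Whether the same trick can be applied inside the C FastCGI library is a separate question.

```ruby
require "socket"

# Accepts connections until asked to stop. The trap handler only sets a
# flag; the loop notices it within poll_interval seconds, even when no
# request is in progress.
class GracefulServer
  attr_reader :handled

  def initialize(server, poll_interval = 0.2)
    @server = server
    @poll_interval = poll_interval
    @exit_requested = false
    @handled = 0
  end

  def request_exit
    @exit_requested = true
  end

  def run
    until @exit_requested
      # Wake up periodically instead of sitting in accept forever.
      next unless IO.select([@server], nil, nil, @poll_interval)
      client = @server.accept
      client.puts "hello"
      client.close
      @handled += 1
    end
    @handled
  end
end

server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
app = GracefulServer.new(server)
trap("USR1") { app.request_exit }
Process.kill("USR1", Process.pid) # simulate kill -USR1 arriving between requests
puts "handled #{app.run} requests before shutting down"
```

A request that is already being handled finishes before the flag is checked, so no client sees a failure because of the shutdown.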

mod_fcgi.rb.patch (2.31 KB)

On Sat, Mar 08, 2003 at 11:38:31AM +0900, ahoward wrote:

> I quite like the Apache/FastCGI approach because the code for allocating
> instances of the backend is already written and hopefully reasonably
> debugged; it can have a fixed or variable number of instances.

debugged, but not user friendly. i wrote a little wrapper that dealt with
much of the signal handling, and used someone else's pattern (i forget whose) to
bootstrap a normal cgi object off of an fcgi object. it's at

http://groups.google.com/groups?q=ahoward+mod_fcgi+group:comp.lang.ruby&hl=en&lr=&ie=UTF-8&selm=Pine.LNX.4.33.0302121430040.10747-100000%40eli.fsl.noaa.gov&rnum=1

Another minor patch: redirect stderr down the FastCGI pipe.

    def each (*args)
      #install_traps can be called by the user first
      while request = accept
        handle_request {
          $stderr = request.err # and $stdout to request.out ?
          yield(CGI.new(request, *args))
          $stderr = STDERR
          request.finish
        }
        break if @@exit_requested
      end
    end

This gives more informative logs, and also in principle should work if the
FastCGI app is external. [Not sure how you'd set that up though; you'd need
a program which binds FD 0 to a listening socket, or manages a pool of
processes like Apache does]

If Moonwolf is around: how about making FCGX_IsCGI available from Ruby (and
an equivalent function in the pure Ruby version)? Then we could do:

    def each_cgi(*args)
      if iscgi?
        return yield ::CGI.new(*args)
      end
      # ahoward's loop as above
    end

That would let FCGI programs work as normal standalone CGI _and_ as FCGI
without any modifications :-)

Regards,

Brian.

> http://groups.google.com/groups?q=ahoward+mod_fcgi+group:comp.lang.ruby&hl=en&lr=&ie=UTF-8&selm=Pine.LNX.4.33.0302121430040.10747-100000%40eli.fsl.noaa.gov&rnum=1

Yes, I found this sufficiently interesting that I kept a copy when you first
posted it, so thanks for reminding me! It's [ruby-talk:64468] for a shorter
link.

do you use the http://blade.nagaokaut.ac.jp/ruby/ruby-talk/index.shtml
interface to ruby-talk? i tried using that search facility to pull up my post
(searching for 'ahoward'), but didn't have any luck??

Did you find that installing a SIGPIPE handler was actually necessary?
According to the mod_fastcgi documentation, mod_fastcgi itself sets this for
applications which it spawns; also libfcgi installs an empty signal handler
for SIGPIPE as well (OS_SigpipeHandler in libfcgi/os_unix.c)

yeah, i knew about the supposed mod_fastcgi installed PIPE handler but figured
it couldn't hurt. let me know if you find otherwise. regarding
libfcgi/os_unix.c, this is not used if you 'require fcgi.rb', which is an
option. remember the package ships with a ruby and c impl.

I removed install_traps and sent a kill -PIPE to the spawned process and it
didn't seem to blink, so I think it's not actually necessary (but perhaps it
is with the pure Ruby version of fcgi)

which is what i tested with...

A few other points:

* you call '@@server.close' but this ties you to the pure-Ruby
implementation; I am using the C version and it doesn't set such a class
variable. This means you can get warnings logged such as

mod_fcgi.rb:in `install_traps': uninitialized class variable @@server in MOD_FCGI (NameError)

see above.

* the trapping of TERM and HUP doesn't work properly for me. What happens is
that if I send such a signal to the process, nothing happens (ps shows the
same pid) until the next HTTP request comes along, at which point it fails
and Apache returns '500 Internal Server Error'. The process is then
restarted and it's fine thereafter.

silly question but above you said

I removed install_traps and sent a kill -PIPE to the spawned process and it

you did put it back in right? the described behaviour sounds like you did not,
though i suspect you did... ;-(

I put some file debugging in:

  $stderr = File.open("/tmp/errs","a")
  $stderr.sync = true
  ...

  trap('SIGHUP') do
    $stderr.puts "#{Time.now} signals #{@@signals.inspect} handling_request #{@@handling_request.inspect} exit_requested #{@@exit_requested.inspect}"
  ...

What seems to happen is that the trap handler is not even started until the
subsequent request comes in, as shown by the timestamp. The process *then*
commits suicide (since @@handling_request is still false at that point) and
the 500 error occurs. Changing 'exit' to 'exit!' doesn't make any difference
either.

It's as if the signal is held up in accept() until the next incoming
connection arrives. I can't work out why this is the case, but for now just
removing 'install_traps' actually gives me much better results, since the
process just dies and is respawned straight away.

FYI I am running ruby 1.6.8 (2002-12-24) [i386-freebsd4.7] with
ruby-fcgi-0.8.1 + fcgi-2.4.0, apache-1.3.27, mod_fastcgi-2.2.12

* I also had to make a few changes to make it load cleanly under ruby -w
(attached)

i will need to drink more coffee and think about this... signals are so
simple, err... confusing, no simple, no confusing...

Otherwise this all looks very cool, and I actually don't think that it's
Apache-specific. SIGPIPE isn't an Apache extension to fastcgi spec, it just
closes the socket if it isn't interested in waiting for the response, and
the OS generates SIGPIPE. Other fastcgi servers are likely to do the same.

As a result, I think that what you've written really belongs in the core
FCGI library anyway, i.e.

  - bootstrapping of a CGI object (maybe FCGI.each_cgi ?)

and perhaps also optional graceful shutdown on USR1, if it can be made to
work (although that _is_ an Apache feature)

Or else at least it can go on the RubyGardenWiki ?

Regards,

Brian.

P.S. Thinking about USR1, I just checked and libfcgi does install a USR1
handler (which sets an internal flag for a graceful abort). However it
doesn't work very well if the process is between requests, because it just
sits in accept() and catches the signal, so doesn't abort until the next
incoming connection occurs, giving a 500 error to the client.

yes to all of the above. i had hoped that someone else (more than one) would
test it out and make further mods. i think this small piece of code (fastcgi
- not my piece!) is more useful than a lot of the huge app server type
projects out there and a few simple fixes like being able to obtain a
'normal' cgi object, and therefore be able to use a known api, and having the
fastcgi processes behave nicely with signals (to reload and all that) could
make this approach really attractive. IMHO fastcgi seems to overcome all
the shortcomings of cgi programming (persistence, speed, etc) while adding
only a small amount of complexity (signals, etc). in addition to that, the
code involved in implementing a fastcgi lib is so simple even idiots like me
can (sort of) understand it!

i'll try out your mods on my setup next week and get back to you. if you (or
anyone else) are interested we should keep bouncing ideas off each other
until we have something viable, document it, and either release it ourselves
or contribute to the present fcgi project. i would love to see this
technology gain some momentum in the ruby community.

thanks for the significant work.

-a

On Sat, 8 Mar 2003, Brian Candler wrote:

> > http://groups.google.com/groups?q=ahoward+mod_fcgi+group:comp.lang.ruby&hl=en&lr=&ie=UTF-8&selm=Pine.LNX.4.33.0302121430040.10747-100000%40eli.fsl.noaa.gov&rnum=1
>
> Yes, I found this sufficiently interesting that I kept a copy when you first
> posted it, so thanks for reminding me! It's [ruby-talk:64468] for a shorter
> link.

do you use the http://blade.nagaokaut.ac.jp/ruby/ruby-talk/index.shtml
interface to ruby-talk? i tried using that search facility to pull up my post
(searching for 'ahoward'), but didn't have any luck??

I do. The 'Namazu' search is broken; if it can't be fixed then IMO it would
be much better if it were simply removed.

Using 'subjects, regular expression' search and entering 'FCGI' finds your
post though.

yeah, i knew about the supposed mod_fastcgi installed PIPE handler but figured
it couldn't hurt. let me know if you find otherwise. regarding
libfcgi/os_unix.c, this is not used if you 'require fcgi.rb', which is an
option. remember the package ships with a ruby and c impl.

'require fcgi.rb' pulls in the C version if you have it, otherwise it loads
the Ruby version.

If there are going to be two versions then they really should be made as
similar in behaviour as possible of course. The only reason I haven't
tested the Ruby version yet is that I've been too lazy to install stringio
:-)

> * the trapping of TERM and HUP doesn't work properly for me. What happens is
> that if I send such a signal to the process, nothing happens (ps shows the
> same pid) until the next HTTP request comes along, at which point it fails
> and Apache returns '500 Internal Server Error'. The process is then
> restarted and it's fine thereafter.

silly question but above you said

> I removed install_traps and sent a kill -PIPE to the spawned process and it

you did put it back in right? the described behaviour sounds like you did not,
though i suspect you did... ;-(

Yep, I tried both combinations.

- with install_traps: I get the hanging behaviour as described above

- with install_traps commented out: the process dies immediately on
  receipt of TERM or HUP as expected (although if it were in the middle of
  processing a request, it would bomb out without tidying up rather than
  finish the request)

FYI I am currently working on adding FCGI server support to druby (there is
already a HTTP client in samples/http0.rb). Will let you know if I get it to
work!

Regards,

Brian.

On Sat, Mar 08, 2003 at 04:40:17PM +0000, ahoward wrote:

If Moonwolf is around: how about making FCGX_IsCGI available from Ruby (and
an equivalent function in the pure Ruby version)? Then we could do:

ruby-fcgi-0.8.2 released
http://www.moonwolf.com/ruby/archive/ruby-fcgi-0.8.2.tar.gz

Changes (0.8.1 → 0.8.2)

  • Added FCGI.is_cgi? (C and pure Ruby versions)
  • The pure Ruby version can now be forced to load:
    FCGI_PURE_RUBY = true
    require 'fcgi'

MoonWolf moonwolf@moonwolf.com

‘require fcgi.rb’ pulls in the C version if you have it, otherwise it loads
the Ruby version.

doesn’t this always require ‘fcgi.rb’? i mean, isn’t only

require 'fcgi'

allowed to choose the *.so over the *.rb?

If there are going to be two versions then they really should be made as
similar in behaviour as possible of course. The only reason I haven’t
tested the Ruby version yet is that I’ve been too lazy to install stringio
:-)

i don’t know the history of this, but it certainly seems like a GC’d language
would be a much better choice to write fastcgi servers in…

  • the trapping of TERM and HUP doesn’t work properly for me. What happens is
    that if I send such a signal to the process, nothing happens (ps shows the
    same pid) until the next HTTP request comes along, at which point it fails
    and Apache returns ‘500 Internal Server Error’. The process is then
    restarted and it’s fine thereafter.

i checked this out a little using strace - looks like accept (or calls before
accept) are catching everything so there’s not much to be done :

[howardat@dhcppc1 fcgi-bin]# strace -p 10511
accept(0, 0xbfffe1ac, [112]) = ? ERESTARTSYS (To be restarted)
--- SIGUSR2 (User defined signal 2) ---
sigreturn() = ? (mask now [])
accept(0,

same goes for HUP, USR1, etc.

Yep, I tried both combinations.

  • with install_traps: I get the hanging behaviour as described above

  • with install_traps commented out: the process dies immediately on
    receipt of TERM or HUP as expected (although if it were in the middle of
    processing a request, it would bomb out without tidying up rather than
    finish the request)

i think sending SIGHUP and having one request fail is about as good as it gets
;-(

i’m not sure what the alternative would be…

FYI I am currently working on adding FCGI server support to druby (there is
already a HTTP client in samples/http0.rb). Will let you know if I get it to
work!

to what end? i mean, what would be the point of having a distributed fastcgi
process? not that there isn’t a point, i’m just wondering what you’re on
about?

one thing which really needs addressed with fcgi is a way to run from a tty so
you can enter params and see the html (or error messages) come blasting back
out… not being able to do this is a real pain when debugging.

-a

On Sun, 9 Mar 2003, Brian Candler wrote:

Wonderful!

I’ll try and find time to sort out the signal handling issues (which may
just mean a FreeBSD vs. rest-of-the-world issue). Once done, just need to
add

  FCGI.each_cgi do |cgi|

  end

from ahoward’s code and I think we have a really nice solution. In fact, I
think CGI supports mod_ruby as well, so the program should run unchanged in
three environments: standard CGI, fastcgi, and mod_ruby!

Cheers,

Brian.

On Tue, Mar 11, 2003 at 07:28:42PM +0900, MoonWolf wrote:

If Moonwolf is around: how about making FCGX_IsCGI available from Ruby (and
an equivalent function in the pure Ruby version)? Then we could do:

ruby-fcgi-0.8.2 released
http://www.moonwolf.com/ruby/archive/ruby-fcgi-0.8.2.tar.gz

Changes (0.8.1 → 0.8.2)

  • Added FCGI.is_cgi? (C and pure Ruby versions)
  • The pure Ruby version can now be forced to load:
    FCGI_PURE_RUBY = true
    require 'fcgi'

‘require fcgi.rb’ pulls in the C version if you have it, otherwise it loads
the Ruby version.

doesn’t this always require ‘fcgi.rb’? i mean, isn’t only

require 'fcgi'

allowed to choose the *.so over the *.rb?

No - it is fcgi.rb which in turn tries to do require fcgi.so, and catches
the exception if that fails, in which case it builds the Ruby classes
itself.

So require ‘fcgi’ and require ‘fcgi.rb’ are identical.
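The load-the-extension-or-fall-back arrangement described here can be sketched in a few lines. This is a generic illustration, not the actual contents of fcgi.rb: hypothetical_fast_ext.so and HypotheticalFast are made-up names, and since no such extension exists, running it takes the fallback branch.

```ruby
# Try the compiled extension first. 'hypothetical_fast_ext.so' is a
# made-up name, so on any real system this raises LoadError.
begin
  require "hypothetical_fast_ext.so"
  implementation = :c_extension
rescue LoadError
  # Extension not installed: define the same interface in plain Ruby.
  module HypotheticalFast
    def self.checksum(bytes)
      bytes.sum % 256
    end
  end
  implementation = :pure_ruby
end

puts implementation
puts HypotheticalFast.checksum([1, 2, 3])
```

Callers only ever require the .rb file and get whichever implementation is available, which is why the two implementations need to behave alike.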

  • the trapping of TERM and HUP doesn’t work properly for me. What happens is
    that if I send such a signal to the process, nothing happens (ps shows the
    same pid) until the next HTTP request comes along, at which point it fails
    and Apache returns ‘500 Internal Server Error’. The process is then
    restarted and it’s fine thereafter.

i checked this out a little using strace - looks like accept (or calls before
accept) are catching everything so there’s not much to be done :

[howardat@dhcppc1 fcgi-bin]# strace -p 10511
accept(0, 0xbfffe1ac, [112]) = ? ERESTARTSYS (To be restarted)
--- SIGUSR2 (User defined signal 2) ---
sigreturn() = ? (mask now [])
accept(0,

same goes for HUP, USR1, etc.

That’s what I’d expect - well, actually I’d expect EINTR. Poking around
FreeBSD header files, there’s no ERESTARTSYS but there’s an ERESTART which
is used internally by the kernel, so I guess certain types of syscall are
automatically restarted. Looks like I’ll need to play to find out whether
accept() works that way.

  • with install_traps: I get the hanging behaviour as described above

  • with install_traps commented out: the process dies immediately on
    receipt of TERM or HUP as expected (although if it were in the middle of
    processing a request, it would bomb out without tidying up rather than
    finish the request)

i think sending SIGHUP and having one request fail is about as good as it gets
;-(

i’m not sure what the alternative would be…

Don’t trap the SIGHUP. The child dies straight away, it gets respawned
straight away by Apache (which will keep a minimum of one child per fcgi
around unless you configure it otherwise), so the next request is handled
successfully.

FYI I am currently working on adding FCGI server support to druby (there is
already a HTTP client in samples/http0.rb). Will let you know if I get it to
work!

to what end? i mean, what would be the point of having a distributed fastcgi
process? not that there isn’t a point, i’m just wondering what you’re on
about?

I mean using the drb protocol over HTTP as an API: e.g. front-end server
talks to the world, and talks DRB-over-HTTP to the back end system, which
has a pool of database processes run under fastcgi. It requires Ruby at both
ends of course, but it should be a darned sight faster than SOAP or
YAML/OKAY, and is so easy to use because you just make object calls on the
front-end (which magically perform actions on the backend)

one thing which really needs addressed with fcgi is a way to run from a tty so
you can enter params and see the html (or error messages) come blasting back
out… not being able to do this is a real pain when debugging.

It can be made automatic. The C library provides a function FCGX_IsCGI()
which lets you detect whether you’re running under a fastcgi environment or
not. (Alternatively, you could write a shell which popen’s a fastcgi process
and talks fastcgi protocol to it)

Anyway, I did manage to get DRB/HTTP to work, but I’ve now been waylaid
looking at performance problems. Firstly, my server was only handling about
10 requests per second, even for plain HTML pages. I finally solved this by
setting the TCP_NODELAY socket option (see attached)
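For reference, a small self-contained sketch of setting TCP_NODELAY on a plain Ruby socket. The throwaway local listener is only there so the example runs on its own; it is not part of the benchmark setup described in this thread.

```ruby
require 'socket'

# Throwaway local listener so the example is self-contained; port 0 lets
# the OS pick a free port.
server = TCPServer.new('127.0.0.1', 0)
client = TCPSocket.new('127.0.0.1', server.addr[1])

# Disable Nagle's algorithm so small writes are sent immediately.
client.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY, 1)

# Read the option back to confirm it took effect.
nodelay = client.getsockopt(Socket::IPPROTO_TCP, Socket::TCP_NODELAY).int
puts "TCP_NODELAY enabled: #{nodelay != 0}"

client.close
server.close
```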

Interestingly, the Ruby MOD_FCGI/CGI module itself isn’t particularly
speedy when compared with raw fcgi:

Test 1: (MOD_FCGI)

#!/usr/local/bin/ruby
require 'mod_fcgi'
MOD_FCGI.each('html3') do |cgi|
  cgi.out { "Minimal\n" }
end

Test 2: (raw FCGI)

#!/usr/local/bin/ruby
require 'fcgi'
FCGI.each { |req|
  req.out.print "Content-Type: text/plain\n\nMinimal\n"
  req.finish
}

Request/response cycle times:

  • plain HTML page 0.00486 secs
  • test 1 0.0744 secs
  • test 2 0.00666 secs (> 10 times faster)

So may be worth doing some profiling of the CGI module. (This doesn’t seem
to be a TCP_NODELAY problem: trussing the code shows it read()ing a FCGI
request over the socket, and then write()ing the response back 0.07 seconds
later, so it does appear to be Ruby processing)

This I’m happy with. What I’m not happy with, under Solaris, is a strange
bug where normal CGI requests can take up to 3 seconds to complete. I did
once manage to capture this with truss, it seemed to be doing
alarm(3)
sigsuspend … sleeps here
… woken by the alarm signal

However, since today, running a truss on the child which is handling the
requests makes the problem go away :-( In fact, a ‘normal’ CGI of the form

#!/bin/sh
echo "Content-Type: text/html"
echo ""
echo "Hello"

actually executes faster (0.035 secs) than the Ruby mod_fcgi (0.086 secs)
although fcgi is still faster (0.022 secs) if truss is in place.

With trussing turned off, the above shell script takes an average of 1.5
seconds per iteration. Argh!

Anyway, that’s either an Apache or a Solaris problem, and hence off-topic
for this list (although I’d be very happy to hear the solution if anyone
knows it :-)

FYI, with DRb over HTTP I am managing about 20 RPC exchanges per second,
with a reasonably substantial object being returned, and a DBI query thrown
in as part of the request processing. That I’m very happy with.

Regards,

Brian.

hithtml.rb (762 Bytes)

···

On Sun, Mar 09, 2003 at 06:40:35PM +0900, ahoward wrote:

On Sun, 9 Mar 2003, Brian Candler wrote:

No - it is fcgi.rb which in turn tries to do require fcgi.so, and catches
the exception if that fails, in which case it builds the Ruby classes
itself.

hadn’t noticed this…

That’s what I’d expect - well, actually I’d expect EINTR. Poking around
FreeBSD header files, there’s no ERESTARTSYS but there’s an ERESTART which
is used internally by the kernel, so I guess certain types of syscall are
automatically restarted. Looks like I’ll need to play to find out whether
accept() works that way.

forgot to mention i was on linux. i do know that restarting system calls is
one of those gotchas between unices. what i meant to say was that i don’t
think it would be wise to mess with accept’s sig handler and, since they get
installed AFTER the ones in ruby, trying to reload via signals may be futile.

Don’t trap the SIGHUP. The child dies straight away, it gets respawned
straight away by Apache (which will keep a minimum of one child per fcgi
around unless you configure it otherwise), so the next request is handled
successfully.

the problem is that this drops the current request, if you are using a
transactional database that might be fine - otherwise…

I mean using the drb protocol over HTTP as an API: e.g. front-end server
talks to the world, and talks DRB-over-HTTP to the back end system, which
has a pool of database processes run under fastcgi. It requires Ruby at both
ends of course, but it should be a darned sight faster than SOAP or
YAML/OKAY, and is so easy to use because you just make object calls on the
front-end (which magically perform actions on the backend)

i’m fuzzy on why you need fastcgi for the database backend?

It can be made automatic. The C library provides a function FCGX_IsCGI()
which lets you detect whether you’re running under a fastcgi environment or
not. (Alternatively, you could write a shell which popen’s a fastcgi process
and talks fastcgi protocol to it)

i will look into this, but not soon. anyone else?

Interestingly, the Ruby MOD_FCGI/CGI module itself isn’t particularly
speedy when compared with raw fcgi:

there are a lot more function calls in mod_fcgi…

> Argh!

good luck. ;-)

have you looked at siege - http://www.joedog.org/ - for testing?

-a

···

On Mon, 10 Mar 2003, Brian Candler wrote:

Content-Disposition: attachment; filename=“hithtml.rb”

#!/usr/local/bin/ruby

require 'net/http'
require 'uri'

TEST = [
  [ 'http://127.0.0.1/index.html' ],
  [ 'http://127.0.0.1/fcgi-bin/ftest.cgi' ],
  [ 'http://127.0.0.1/fcgi-bin/ftest2.cgi' ],
]
N = 50
PERSIST = true

$defout.sync=true

TEST.each do |test|
  it = URI.parse(test[0])
  puts "Req:\t#{it.host},#{it.port},#{it.path}"
  GC.disable
  a = Net::HTTP.new(it.host, it.port || 80)
  if PERSIST
    a.start
    a.socket.socket.setsockopt(6, Socket::TCP_NODELAY, 1)  # <<<< FOR UNIX
  end
  r = a.get(it.path)
  puts r    ## want to check content is correct?
  start = Time.now
  N.times {
    print "."   ## progress info
    r = a.get(it.path)
  }
  puts "\nTime:\t#{(Time.now - start)/N}"
  puts
  a.finish if PERSIST
  GC.enable
  GC.start
end

Ara Howard
NOAA Forecast Systems Laboratory
Information and Technology Services
Data Systems Group
R/FST 325 Broadway
Boulder, CO 80305-3328
Email: ahoward@fsl.noaa.gov
Phone: 303-497-7238
Fax: 303-497-7259
====================================

brian-

i simplified my signal handlers for the MOD_FCGI class as

def install_traps
  trap('SIGUSR2') do
    @@exit_requested = true
  end
end

and this seems to work really nicely. in other words, receiving USR2 causes a
fcgi program to always handle the next request, and then reload itself.
using this handler one can avoid a fcgi program bombing out in the middle of a
request and bombing on the next request with ‘internal server error’, but the
developer must reload twice to see any script changes made. this seems
a reasonable price to pay if it prevents users from seeing web server errors
or having their transactions rolled back because the program they’re using has
been updated…
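A runnable sketch of the same "set a flag in the handler, act on it between requests" idea, outside of FCGI. GracefulLoop and its fake request list are hypothetical stand-ins for the MOD_FCGI class and its accept loop, and the example assumes a Unix system where USR2 exists.

```ruby
# Sketch of "finish the current request, then exit" using a flag set from
# a signal handler. GracefulLoop and its request list are hypothetical.
class GracefulLoop
  attr_reader :handled

  def initialize
    @exit_requested = false
    @handled = []
  end

  def exit_requested?
    @exit_requested
  end

  def install_traps
    # Only record the request to stop; the main loop decides when to act.
    Signal.trap('USR2') { @exit_requested = true }
  end

  # Stand-in for the FCGI request loop: the flag is checked only after a
  # request has been handled completely.
  def each_request(requests)
    requests.each do |req|
      yield req
      @handled << req
      break if @exit_requested
    end
  end
end

gl = GracefulLoop.new
gl.install_traps
gl.each_request(%w[a b c]) do |req|
  next unless req == 'b'
  # Simulate an operator signalling the process while 'b' is in progress.
  Process.kill('USR2', Process.pid)
  sleep 0.01 until gl.exit_requested?
end
puts "handled: #{gl.handled.inspect}"
```

Request 'b' is completed even though the signal arrived in the middle of it, and 'c' is never started. In the real FCGI case the signal usually arrives while the process is blocked waiting for a request, which is why one more request is served before the reload.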

-a

···

On Mon, 10 Mar 2003, Brian Candler wrote:

Don’t trap the SIGHUP. The child dies straight away, it gets respawned
straight away by Apache (which will keep a minimum of one child per fcgi
around unless you configure it otherwise), so the next request is handled
successfully.

====================================

> Don't trap the SIGHUP. The child dies straight away, it gets respawned
> straight away by Apache (which will keep a minimum of one child per fcgi
> around unless you configure it otherwise), so the next request is handled
> successfully.

the problem is that this drops the current request, if you are using a
transactional database that might be fine - otherwise...

... the client gets half a page, which IMO is no worse than getting a 500
Server Failed error :-)

> I mean using the drb protocol over HTTP as an API: e.g. front-end server
> talks to the world, and talks DRB-over-HTTP to the back end system, which
> has a pool of database processes run under fastcgi. It requires Ruby at both
> ends of course, but it should be a darned sight faster than SOAP or
> YAML/OKAY, and is *so* easy to use because you just make object calls on the
> front-end (which magically perform actions on the backend)

i'm fuzzy on why you need fastcgi for the database backend?

Well, it's like this. If I run a raw DRb server, I get into threading
issues. Firstly, the way that a DRb server works, the object being served
may get multiple method calls simultaneously from concurrent threads as
multiple requests come in. So you can't just write:

   db = DBI.connect('dbi:foo:','bar','baz')
   DRb.start_service('druby://localhost:9000', db)

because that would be highly dangerous. That can be fixed by wrapping it in
an object which serialises the requests. In fact that object can hold an
array of separate DBI handles, which is even better: in principle, one
client doing a long query then doesn't starve out other clients.
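A minimal sketch of such a wrapper, assuming a pool built on a thread-safe Queue and method_missing. FakeHandle is a hypothetical stand-in for a DBI handle (it only records how many calls were inside it at once), and HandlePool is an illustrative name, not an existing library class.

```ruby
# Sketch: a front object that hands each call to whichever handle is free.
# FakeHandle is a hypothetical stand-in for a DBI handle.
class FakeHandle
  attr_reader :max_concurrent

  def initialize(id)
    @id = id
    @active = 0
    @max_concurrent = 0
  end

  def query(sql)
    @active += 1
    @max_concurrent = [@max_concurrent, @active].max
    sleep 0.02                      # pretend to wait for the database
    "#{sql} (handle #{@id})"
  ensure
    @active -= 1
  end
end

class HandlePool
  def initialize(handles)
    @sample = handles.first         # used only to answer respond_to?
    @free = Queue.new
    handles.each { |h| @free << h }
  end

  def respond_to_missing?(name, include_private = false)
    @sample.respond_to?(name) || super
  end

  def method_missing(name, *args, &block)
    return super unless @sample.respond_to?(name)
    handle = @free.pop              # blocks until some handle is free
    begin
      handle.public_send(name, *args, &block)
    ensure
      @free << handle               # always return the handle to the pool
    end
  end
end

handles = [FakeHandle.new(1), FakeHandle.new(2)]
pool = HandlePool.new(handles)
results = (1..6).map { |i| Thread.new { pool.query("select #{i}") } }.map(&:value)
puts results
```

The idea would be to pass pool, rather than the raw handle, to DRb.start_service. That part is untested here, and it does nothing about the blocking problem described next.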

In practice though, calls to DBI libraries are likely to block Ruby's
threading. Maybe some don't, but at best it will depend on what type of DB
backend you are using. That is the killer for me. And finally, on a
multi-CPU box, I don't want it all taking place on one CPU.

The solution then is to fork into multiple processes, each handling one
request at a time. I didn't fancy writing the code to do that; furthermore,
if doing it in Ruby I'd also have to modify DRb to work over a stdin/stdout
pipe for the parent to pass messages to the child.
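To give an idea of what the hand-rolled version involves, here is a rough sketch (Unix only) of a parent passing Marshal-encoded requests to forked children over pipes. It does not use DRb at all; spawn_worker, call_worker and the "sum the numbers" workload are hypothetical placeholders for the real database call.

```ruby
# Sketch: fork workers that each serve one Marshal-encoded request at a
# time over a pair of pipes. The "work" (summing numbers) is a placeholder.
def spawn_worker
  req_r, req_w = IO.pipe            # parent -> child
  res_r, res_w = IO.pipe            # child -> parent
  [req_r, req_w, res_r, res_w].each(&:binmode)

  pid = fork do
    req_w.close
    res_r.close
    loop do
      msg = Marshal.load(req_r)     # blocks until the parent sends a request
      break if msg == :quit
      Marshal.dump([Process.pid, msg.sum], res_w)
      res_w.flush
    end
    exit!(0)                        # skip at_exit handlers inherited from the parent
  end

  req_r.close
  res_w.close
  { pid: pid, to: req_w, from: res_r }
end

def call_worker(worker, numbers)
  Marshal.dump(numbers, worker[:to])
  worker[:to].flush
  Marshal.load(worker[:from])
end

workers = Array.new(2) { spawn_worker }
replies = workers.each_with_index.map { |w, i| call_worker(w, [i, 10, 20]) }

# Shut the workers down with an explicit message rather than relying on EOF.
workers.each do |w|
  Marshal.dump(:quit, w[:to])
  w[:to].flush
  Process.wait(w[:pid])
end

replies.each { |pid, sum| puts "worker #{pid} answered #{sum}" }
```

Even this toy leaves out dynamic pool sizing, respawning dead children and picking a free worker, which is the part a process manager such as mod_fastcgi takes care of.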

Now, mod_fastcgi handles process creation automatically, even dynamically
sizing the pool, and passing requests to individual children (using the
fastcgi protocol). This seemed to be ideal. I made a small mod to DRb so
that it does not accept() another incoming request until the current one has
completed, and hey presto.
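The "do not accept() again until the current request is done" behaviour can be illustrated with a plain TCPServer. This is only a sketch of the idea, not the actual change made to DRb; the one-line text protocol and the event log are invented for the example.

```ruby
require 'socket'

# Sketch of a strictly sequential accept loop (not DRb's actual code):
# the next accept() only happens after the current request is finished.
server = TCPServer.new('127.0.0.1', 0)
port = server.addr[1]
events = Queue.new

acceptor = Thread.new do
  2.times do
    conn = server.accept
    name = conn.gets.chomp
    events << "start #{name}"
    sleep 0.05                      # pretend the request takes a while
    events << "finish #{name}"
    conn.puts "done #{name}"
    conn.close
  end
end

# Two clients connect at about the same time.
clients = %w[one two].map do |name|
  Thread.new do
    sock = TCPSocket.new('127.0.0.1', port)
    sock.puts name
    reply = sock.gets
    sock.close
    reply
  end
end

replies = clients.map(&:value)
acceptor.join
server.close

log = []
log << events.pop until events.empty?
puts log
```

Whichever client is accepted first, its start and finish lines appear together in the log; the second client simply waits in the listen queue, or is routed to another process when something like mod_fastcgi is managing a pool of them.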

My other solution would have been mod_ruby, which I've not tried yet - then
I get one database handle for each Apache worker process, which also gives
me the concurrency I need.

> Interestingly, the Ruby MOD_FCGI/CGI module itself isn't particularly
> speedy when compared with raw fcgi:

there are *a lot* more function calls in mod_fcgi...

I think the overhead is in the Ruby standard CGI library, not in mod_fcgi.
The attached program on my little laptop takes about 0.10 seconds per
iteration.

It turns out that the vast majority of this is the 'html3' bit, since it
does a whole load of dynamic method additions:

    case type
    when "html3"
      extend Html3
      element_init()
      extend HtmlExtension

Changing CGI.new('html3') to CGI.new(nil) makes it run approximately 180
times faster, at 0.00055 seconds per iteration :-)
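The effect is easy to reproduce in isolation. The sketch below is a hypothetical stand-in for what the 'html3' setup does, not the CGI library's own code: every new object is extended with a module and then has 100 tag methods defined on it individually. The absolute timings will differ from machine to machine, so none are quoted here.

```ruby
# Hypothetical stand-in for CGI's Html3 setup: every new object is
# extended and then has 100 tag methods defined on it, one by one.
TAGS = (1..100).map { |i| "tag#{i}" }

module PerInstanceTags
  def element_init
    TAGS.each do |tag|
      define_singleton_method(tag) { |body = ''| "<#{tag}>#{body}</#{tag}>" }
    end
  end
end

class Plain; end

# Wall-clock time taken by the block, in seconds.
def seconds
  t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  yield
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - t0
end

n = 200
with_tags = seconds { n.times { Plain.new.extend(PerInstanceTags).element_init } }
without_tags = seconds { n.times { Plain.new } }

puts format('with per-instance methods: %.6f s per object', with_tags / n)
puts format('plain objects:             %.6f s per object', without_tags / n)
```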

The moral is: if you can avoid using CGI's tag-generating feature (which I
don't think is friendly anyway) then you get a big performance gain...

Regards,

Brian.

require 'cgi'
class Reader
  def initialize(str)
    @str = str
  end
  def read(n)
    res = @str
    @str = nil
    res
  end
  def binmode; end
end

ENV['REQUEST_METHOD']='GET'
N = 20
src = (1..N).collect { Reader.new("foo=bar") }
start = Time.now
src.each do |s|
  $stdin = s
  CGI.new('html3')
end
puts "Per iteration: #{(Time.now - start)/N}"

···

On Mon, Mar 10, 2003 at 01:03:27PM +0900, ahoward wrote:

the problem is that this drops the current request, if you are using a
transactional database that might be fine - otherwise…

… the client gets half a page, which IMO is no worse than getting a 500
Server Failed error :-)

the problem would be when using a non-transactional db, say mysql, and blowing
up midway through the application - bad thing.

I mean using the drb protocol over HTTP as an API: e.g. front-end server
talks to the world, and talks DRB-over-HTTP to the back end system, which
has a pool of database processes run under fastcgi. It requires Ruby at both
ends of course, but it should be a darned sight faster than SOAP or
YAML/OKAY, and is so easy to use because you just make object calls on the
front-end (which magically perform actions on the backend)

i’m fuzzy on why you need fastcgi for the database backend?

Well, it’s like this. If I run a raw DRb server, I get into threading
issues. Firstly, the way that a DRb server works, the object being served
may get multiple method calls simultaneously from concurrent threads as
multiple requests come in. So you can’t just write:

   db = DBI.connect('dbi:foo:','bar','baz')
   DRb.start_service('druby://localhost:9000', db)

because that would be highly dangerous. That can be fixed by wrapping it in
an object which serialises the requests. In fact that object can hold an
array of separate DBI handles, which is even better: in principle, one
client doing a long query then doesn’t starve out other clients.

i did exactly this for postgresql -

http://raa.ruby-lang.org/list.rhtml?name=pgconngroup

In practice though, calls to DBI libraries are likely to block Ruby’s
threading. Maybe some don’t, but at best it will depend on what type of DB
backend you are using. That is the killer for me. And finally, on a
multi-CPU box, I don’t want it all taking place on one CPU.

The solution then is to fork into multiple processes, each handling one
request at a time. I didn’t fancy writing the code to do that; furthermore,
if doing it in Ruby I’d also have to modify DRb to work over a stdin/stdout
pipe for the parent to pass messages to the child.

or create a DRb object to handle the request, fork, start this object on a port
in the child, contact the parent using drb:// protocol to register with the
parent, and then child and parent can communicate using the drb protocol
exclusively. the parent could also return a handle on the forked child DRb
object to the client - and then communication would be between them from there
on out. i guess this would be like a normal forking server, only using DRb
instead of sockets…

Now, mod_fastcgi handles process creation automatically, even dynamically
sizing the pool, and passing requests to individual children (using the
fastcgi protocol). This seemed to be ideal. I made a small mod to DRb so
that it does not accept() another incoming request until the current one has
completed, and hey presto.

sounds really cool. how do you prompt mod_fastcgi to spawn a process? are
you hitting it via http and the web server or using it directly?

My other solution would have been mod_ruby, which I’ve not tried yet - then
I get one database handle for each Apache worker process, which also gives
me the concurrency I need.

i haven’t tried it yet but the issues i’ve seen on this list with building are
not motivating…

Interestingly, the Ruby MOD_FCGI/CGI module itself isn’t particularly
speedy when compared with raw fcgi:

there are a lot more function calls in mod_fcgi…

I think the overhead is in the Ruby standard CGI library, not in mod_fcgi.
The attached program on my little laptop takes about 0.10 seconds per
iteration.

It turns out that the vast majority of this is the ‘html3’ bit, since it
does a whole load of dynamic method additions:

case type
when "html3"
  extend Html3
  element_init()
  extend HtmlExtension

Changing CGI.new(‘html3’) to CGI.new(nil) makes it run approximately 180
times faster, at 0.00055 seconds per iteration :-)

ouch.

The moral is: if you can avoid using CGI’s tag-generating feature (which I
don’t think is friendly anyway) then you get a big performance gain…

funny, i NEVER use this except for in my example program. i use amrita for
all my stuff.

i love amrita. anyone used the new C version?

-a

···

On Tue, 11 Mar 2003, Brian Candler wrote:

====================================