"loop do IO.select([io], nil, nil)" eats 95% of CPU, any other way?

Hi, which is the most efficient way of receiving and processing data
from a network socket?

I use GServer in this common way:

···

--------------
class MyServer < GServer
    def serve(io)
        loop do
              if IO.select([io], nil, nil)
                  ....
--------------

"top" shows Ruby eating more than 90% of the CPU even though there
are no connections yet... :(
Any other suggestions?

Thanks a lot.

--
Iñaki Baz Castillo
<ibc@aliax.net>

Why don't you simply use blocking IO?

  robert

···

On 26.03.2008 17:34, Iñaki Baz Castillo wrote:

Hi, which is the most efficient way of receiving and processing data
from a network socket?

I use GServer in this common way:

--------------
class MyServer < GServer
    def serve(io)
        loop do
              if IO.select([io], nil, nil)
                  ....
--------------

"top" shows Ruby eating more than 90% of the CPU even though there
are no connections yet... :(
Any other suggestions?

Thanks a lot.

···

On Thu, 27 Mar 2008, Iñaki Baz Castillo wrote:

Hi, which is the most efficient way of receiving and processing data
from a network socket?

I use GServer in this common way:

--------------
class MyServer < GServer
     def serve(io)
         loop do
               if IO.select([io], nil, nil)
                   ....
--------------

"top" shows Ruby eating more than 90% of the CPU even though there
are no connections yet... :(
Any other suggestions?

Hmm. From the "man select" page...

       timeout is an upper bound on the amount of time elapsed before
        select() returns. It may be zero, causing select() to return
        immediately. (This is useful for polling.) If timeout is NULL
        (no timeout), select() can block indefinitely.

ri IO.select
  IO.select(read_array
      [, write_array
      [, error_array
      [, timeout]]] ) => array or nil

My guess is you have one too few nils in there, and the default timeout is 0, not nil.
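
One way to check the guess about the default timeout is a small experiment on a pipe. This is only a sketch, assuming a reasonably current Ruby; the names reader, writer, feeder, polled, ready and waited are made up for illustration. In the Ruby versions I'm aware of, leaving the timeout out behaves like passing nil (block until something is ready), while 0 polls and returns immediately.

```ruby
# Sketch: compare a zero timeout (poll) with an omitted timeout (block).
reader, writer = IO.pipe

# Timeout 0: nothing has been written yet, so select returns nil at once.
polled = IO.select([reader], nil, nil, 0)

# Make the pipe readable after a short delay, from another thread.
feeder = Thread.new do
  sleep 0.2
  writer.write("x")
end

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
ready = IO.select([reader])   # no timeout given: waits for the write
waited = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
feeder.join

puts "polled=#{polled.inspect} waited=#{waited.round(2)}s"
```

If the second call returned immediately instead of waiting for the feeder thread, that would support the "default is 0" theory; on the setups I know of, it waits.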

If you are on Linux, try...

   "man strace"

to discover why Linux is such a great place to develop on... you can
_always_ find out what is really going on with any program.

As a little side comment... You can _always_ take a program written
around a select & state machine and make it multi-threaded doing
blocking I/O or conversely refactor a multi-threaded app into a select
based state machine.

Choose whichever is easiest / cleanest for you.

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

I don't know what GServer is, but I guess that serve is called when
there's a connection. As John points out, if io.eof? is true, select
will return immediately. For example:

#!/usr/bin/env ruby

loop do
    open("fifo") do |f|
        puts "Opened fifo"
        loop do
            ready, = select([f], nil, nil, nil).first
            break if ready.nil?
            p ready.read(1000)
            break if ready.eof?
        end
    end
    puts "Closed fifo"
end

You'll see that it opens the fifo, blocks on it, once you write to it
select returns, and once you reach EOF the inner loop breaks and the
fifo is closed. You can do something like:

$ ruby -e 'puts "x"*2000' > fifo

and you'll see the inner loop consuming all the input.

If you remove:

            break if ready.eof?

you'll see the behaviour you describe.
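
The same effect can be reproduced without a fifo. The following is a minimal sketch on an in-process pipe (reader, writer and chunks are illustrative names, not part of the original script): once the writer is closed, select keeps reporting the descriptor as readable, so the loop needs an explicit EOF check to terminate.

```ruby
# Sketch: a select loop on a pipe that stops at EOF instead of spinning.
reader, writer = IO.pipe
writer.write("x" * 25)
writer.close                  # the reader will hit EOF after 25 bytes

chunks = []
loop do
  ready, = IO.select([reader], nil, nil, nil)  # ready is the readable list
  io = ready.first
  chunk = io.read(10)         # read(n) returns nil once EOF is reached
  break if chunk.nil?         # without this check the loop never blocks again
  chunks << chunk
end

p chunks.map(&:size)
```

Checking the return value of read for nil plays the same role as "break if ready.eof?" in the script above.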

Short answer: are you sure serve is called only if there's a connection?

Marcelo

···

On Wed, Mar 26, 2008 at 10:34 AM, Iñaki Baz Castillo <ibc@aliax.net> wrote:

--------------
class MyServer < GServer
    def serve(io)
        loop do
              if IO.select([io], nil, nil)
                  ....
--------------

"top" shows Ruby eating more than 90% of the CPU even though there
are no connections yet... :(
Any other suggestions?

Assuming of course that a block in one thread won't block all other threads.
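
That assumption can be checked with a short sketch (blocked, ticks and result are hypothetical names; this assumes Ruby's usual behaviour of releasing other threads while one waits on IO): one thread blocks in read on an empty pipe while the main thread keeps working.

```ruby
# Sketch: a blocking read in one thread does not stall the main thread.
reader, writer = IO.pipe

blocked = Thread.new { reader.read(5) }  # blocks: nothing written yet

ticks = 0
3.times do
  ticks += 1        # the main thread still makes progress
  sleep 0.05
end

writer.write("hello")    # now the blocked thread can finish
result = blocked.value   # joins the thread and returns what it read

puts "ticks=#{ticks} result=#{result.inspect}"
```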

···

On 26 Mar 2008, at 19:53, John Carter wrote:

As a little side comment... You can _always_ take a program written
around a select & state machine and make it multi-threaded doing
blocking I/O or conversely refactor a multi-threaded app into a select
based state machine.

Hmm. From the "man select" page...

       timeout is an upper bound on the amount of time elapsed before
        select() returns. It may be zero, causing select() to return
        immediately. (This is useful for polling.) If timeout is NULL
        (no timeout), select() can block indefinitely.

My guess is you have one too few nils in there, and the default timeout is
0, not nil.

Yes, true, with "strace" I see:

...
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
select(4, [3], , , {0, 0}) = 0 (Timeout) <---
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
sigprocmask(SIG_BLOCK, NULL, ) = 0
...

but I've tried all of these values for the 4th parameter of "IO.select":
  IO.select([io], nil, nil, 0)
  IO.select([io], nil, nil, nil)
  IO.select([io], nil, nil, X)

and nothing changes; in every case "strace" shows the same output. Why?

to discover why linux is such a great place to develop on.... you can
_always_ find out what is really going on with any program.

Sure I use Linux... what else? XD

As a little side comment... You can _always_ take a program written
around a select & state machine and make it multi-threaded doing
blocking I/O or conversely refactor a multi-threaded app into a select
based state machine.

Of course I need to read a lot about IO methods :)

Thanks a lot.

···

El Miércoles, 26 de Marzo de 2008, John Carter escribió:

--
Iñaki Baz Castillo

I've tested your code and I understand it now.
The problem appears when I use GServer. I've written a very simple example.

Your code, slightly modified:

--- io1.rb --------------
#!/usr/bin/env ruby

loop do
    puts "----- main loop -----"
    open("fifo") do |f|
        puts "Opened fifo"
        loop do
            p "-- second loop --"
            ready, = select([f], nil, nil, nil).first
            break if ready.nil?
            p ready.read(10)
            break if ready.eof?
        end
    end
    puts "Closed fifo"
end

···

2008/3/27, Marcelo <marcelo.magallon@gmail.com>:

I don't know what GServer is, but I guess that serve is called when
  there's a connection. As John points out, if io.eof? is true, select
  will return immediately. For example:

#!/usr/bin/env ruby

loop do
    open("fifo") do |f|
        puts "Opened fifo"
        loop do
            ready, = select([f], nil, nil, nil).first
            break if ready.nil?
            p ready.read(1000)
            break if ready.eof?
        end
    end
    puts "Closed fifo"
end

  You'll see that it opens the fifo, blocks on it, once you write to it
  select returns, and once you reach EOF the inner loop breaks and the
  fifo is closed. You can do something like:

  $ ruby -e 'puts "x"*2000' > fifo

  and you'll see the inner loop consuming all the input.

  If you remove:

            break if ready.eof?

  you'll see the behaviour you describe.

--------------------

Now I do:

~# ./io1.rb
----- main loop -----

(and waits)

~# ruby -e 'puts "X_"*2' > fifo
Opened fifo
"-- second loop --"
"X_X_\n"
Closed fifo
----- main loop -----

(and waits again).

Perfect.

But now I try a similar way using GServer (that implements a TCP
multithreaded server):

----------- io2.rb ----------------
#!/usr/bin/env ruby

require 'gserver'

class Server < GServer

        def initialize(port=2000)
                super(port)
        end

        def serve(io)

                puts "------------ serve(io) -------------"
                loop do
                        p "-- second loop --"
                        ready, = select([io], nil, nil, nil).first
                        break if ready.nil?
                        p ready.read(10)
                        break if ready.eof?
                end
                puts "---------- end server(io) ------------"
        end

end

server = Server.new
server.audit=true
server.start

loop do
        #puts "----- main loop -----"
        break if server.stopped?
end
-------------------------------------

~# ./io2.rb
----- main loop -----
----- main loop -----
----- main loop -----
----- main loop -----
...

So there is no connection yet but Ruby is doing a loop and eating 90%
of CPU. Why??

I connect:

~# echo "12345678901234567890QWERTYUIOP" | nc 127.0.0.1 2000 (and
press Ctrl +C):
------------ serve(io) -------------
"-- second loop --"
"1234567890"
"-- second loop --"
"1234567890"
"-- second loop --"
"QWERTYUIOP"
"-- second loop --"
"\n"
---------- end server(io) ------------
[Thu Mar 27 10:48:50 2008] Server 127.0.0.1:2000 client:44187 disconnect

  Short answer: are you sure serve is called only if there's a connection?

Yes, "puts ---------server(io) ----------" confirms it.

I will try to do the same without GServer or threads, using the
TCPServer class directly.

--
Iñaki Baz Castillo
<ibc@aliax.net>

My guess is if you include the read statements you get something like this...
   read(3, "", 1000) = 0
Which means (according to "man 2 read")
   On success, the number of bytes read is returned (zero indicates end of file)

If you read "man select_tut"
        9. If the functions read(2), recv(2), write(2), and send(2) fail
           with errors other than those listed in 7., or one of the input
           functions returns 0, indicating end of file, then you should
           not pass that descriptor to select() again. In the above
           example, I close the descriptor immediately, and then set it
           to -1 to prevent it being included in a set.

Thus...

mkfifo foofi
ruby -w test.rb&
echo bah > foofi

==test.rb=============================================================
loop do
    begin
       open( "foofi") do |f|
          loop do
             p select([f],nil,nil,500000)
             p f.sysread(1000)
          end
       end
    rescue EOFError => details
       p details
    end
end

···

On Thu, 27 Mar 2008, Iñaki Baz Castillo wrote:

...
sigprocmask(SIG_BLOCK, NULL, ) = 0
select(4, [3], , , {0, 0}) = 0 (Timeout) <---
sigprocmask(SIG_BLOCK, NULL, ) = 0

======================================================================
Works as expected...

But...

======================================================================
open( "foofi") do |f|
    loop do
      p select([f],nil,nil,500000)
      p f.sysread(1000)
    end
end

Bombs out with EOFError

but curiously

open( "foofi") do |f|
    loop do
       p select([f],nil,nil,500000)
       p f.read(1000)
    end
end

Exhibits exactly the behaviour you describe.

Looking at the strace....
open("foofi", O_RDONLY|O_LARGEFILE) = 9
select(10, [9], NULL, NULL, {500000, 0}) = 1 (in [9], left {500000, 0})
fstat64(1, {st_dev=makedev(0, 11), st_ino=37, st_mode=S_IFCHR|0620, st_nlink=1, st_uid=1001, st_gid=5, st_blksize=1024, st_blocks=0, st_rdev=makedev(136, 35), st_atime=2008/03/27-14:45:52, st_mtime=2008/03/27-14:45:52, st_ctime=2008/03/27-14:45:52}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7db2000
write(1, "[[#<File:foofi>], , ]\n", 26) = 26
fstat64(9, {st_dev=makedev(8, 3), st_ino=2375703, st_mode=S_IFIFO|0644, st_nlink=1, st_uid=1001, st_gid=65534, st_blksize=4096, st_blocks=0, st_size=0, st_atime=2008/03/27-14:43:56, st_mtime=2008/03/27-14:45:54, st_ctime=2008/03/27-14:45:54}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7db1000
read(9, "bah\n", 4096) = 4
read(9, "", 4096) = 0
write(1, "\"bah\\n\"\n", 8) = 8
select(10, [9], NULL, NULL, {500000, 0}) = 1 (in [9], left {500000, 0})
write(1, "[[#<File:foofi>], , ]\n", 26) = 26
write(1, "nil\n", 4) = 4
select(10, [9], NULL, NULL, {500000, 0}) = 1 (in [9], left {500000, 0})
write(1, "[[#<File:foofi>], , ]\n", 26) = 26
write(1, "nil\n", 4) = 4

The curious part is why f.read doesn't raise an EOFError.

According to ri IO#read...

      At end of file, it returns nil or "" depend on length.
      ios.read() and ios.read(nil) returns "".
      ios.read(positive-integer) returns nil.

So it all behaves according to plan... just not your plan... :-))
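
The difference between the two calls can be seen side by side in a small sketch. It uses in-process pipes rather than the fifo, and the names (buffered_r, raw_r and so on) are invented for the example; two separate pipes are used so that buffered and unbuffered reads are not mixed on one descriptor.

```ruby
# Sketch: at EOF, read(n) returns nil while sysread raises EOFError.
buffered_r, buffered_w = IO.pipe
raw_r, raw_w = IO.pipe
buffered_w.close   # both readers are now at EOF
raw_w.close

# select still reports a descriptor at EOF as readable, even when polling.
still_ready = IO.select([buffered_r], nil, nil, 0)

read_result = buffered_r.read(1000)   # nil, no exception

sysread_error =
  begin
    raw_r.sysread(1000)
  rescue EOFError => e
    e                                 # sysread signals EOF by raising
  end

p still_ready, read_result, sysread_error.class
```

So a loop built on read(n) has to test for nil itself, whereas one built on sysread gets thrown out of the loop by the exception.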

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

I've now tried with TCPServer alone (and Threads), without GServer, and
this loop issue doesn't occur:

--- io3.rb ------------------------------------
#!/usr/bin/env ruby

require 'socket'

server = TCPServer.open(2000)

loop do
  p "------- main loop --------"
  socket = server.accept

  Thread.start do # one thread per client

       s = socket
        loop do
                select([s], nil, nil, nil)
                break if s.nil?
                p s.read(10)
                break if s.eof?
        end

   end

end

------------------------------------------------------

~# ./io3.rb
"------ main loop ------" (and waits)

~# echo "1234567890" | nc 127.0.0.1 2000
"------ main loop ------"
"1234567890"

So I don't know why, but with GServer the main loop spins all the time,
while without GServer it doesn't.

My guess is if you include the read statements you get something like
this... read(3, "", 1000) = 0
Which means (according to "man 2 read")

Which lib package must I install to get these man pages? I don't find them in
a Debian with default installation.

==test.rb=============================================================
loop do
    begin
       open( "foofi") do |f|
          loop do
             p select([f],nil,nil,500000)
             p f.sysread(1000)
          end
       end
    rescue EOFError => details
       p details
    end
end

Works as expected...

But...

======================================================================
open( "foofi") do |f|
    loop do
      p select([f],nil,nil,500000)
      p f.sysread(1000)
    end
end

Bombs out with EOFError

but curiously

open( "foofi") do |f|
    loop do
       p select([f],nil,nil,500000)
       p f.read(1000)
    end
end

Exhibits exactly the behaviour you describe.

So it all behaves according to plan... just not your plan... :-))

Thanks a lot for such great information; it's a really good explanation.
Tomorrow I'll review all of it.

Thanks a lot for all and best regards.

···

El Jueves, 27 de Marzo de 2008, John Carter escribió:

--
Iñaki Baz Castillo

I've now tried with TCPServer alone (and Threads), without GServer, and
this loop issue doesn't occur:

--- io3.rb ------------------------------------
#!/usr/bin/env ruby

require 'socket'

server = TCPServer.open(2000)

loop do
  p "------- main loop --------"
  socket = server.accept

accept blocks if there's no connection available, unless you tell the
system not to. The default is to block.

So I don't know why, but with GServer the main loop spins all the time,
while without GServer it doesn't.

Because...

> def serve(io)
>
> puts "------------ serve(io) -------------"
> loop do
> p "-- second loop --"
>
> ready, = select([io], nil, nil, nil).first

This will block if there's no data available but the file descriptor
hasn't reached EOF.

> loop do
> #puts "----- main loop -----"
> break if server.stopped?
> end

This code doesn't block, so it will loop as fast as it can.

Try replacing it with something like:

    loop do
        puts "----- main loop -----"
        sleep(1)
        break if server.stopped?
    end

Note that GServer.start will spawn a new thread, which handles all the
incoming connections. It will spawn a new thread for each new
connection, so your "serve" method has to handle it and then exit, it
doesn't make sense for it to keep lingering around after the client has
gone away.

You can also replace your main loop by server.join.
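
To illustrate the idea of waiting on the server thread instead of polling it, here is a rough sketch. Note the assumptions: it uses plain TCPServer and a single Thread as a stand-in for GServer (which may need a separate install on newer Rubies), it binds to an OS-chosen port, and the names acceptor, port and received are made up for the example.

```ruby
require 'socket'

# Assumption: TCPServer plus one Thread stand in for GServer here.
server = TCPServer.new('127.0.0.1', 0)  # port 0 lets the OS pick a free port
port = server.addr[1]

acceptor = Thread.new do
  client = server.accept   # blocks without spinning until a client connects
  data = client.read       # blocks until the client closes its end
  client.close
  data
end

# A throwaway client so the sketch terminates on its own.
TCPSocket.open('127.0.0.1', port) { |sock| sock.write("hello") }

acceptor.join              # sleeps until the thread finishes; no busy loop
received = acceptor.value
server.close
puts received
```

The main thread spends its time asleep inside join, which is why the CPU usage drops compared with a bare "loop do ... end" that only tests a flag.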

Marcelo

···

On Thu, Mar 27, 2008 at 4:47 AM, Iñaki Baz Castillo <ibc@aliax.net> wrote:

manpages-dev.

Marcelo

···

On Wed, Mar 26, 2008 at 8:51 PM, Iñaki Baz Castillo <ibc@aliax.net> wrote:

Which lib package must I install to get these man pages? I don't find
them in a Debian with default installation.

I'm using Ubuntu 7.10 "Gutsy Gibbon" so it might not be exactly the
same for you and I'm not sure which bundle they came in... but I'd
guess manpages-dev or glibc-doc.

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : john.carter@tait.co.nz
New Zealand

···

On Thu, 27 Mar 2008, Iñaki Baz Castillo wrote:

El Jueves, 27 de Marzo de 2008, John Carter escribió:

My guess is if you include the read statements you get something like
this... read(3, "", 1000) = 0
Which means (according to "man 2 read")

Which lib package must I install to get these man pages? I don't find them in
a Debian with default installation.

> So I don't know why, but with GServer the main loop spins all the time,
> while without GServer it doesn't.

Because...

> > def serve(io)
> >
> > puts "------------ serve(io) -------------"
> > loop do
> > p "-- second loop --"
> >
> > ready, = select([io], nil, nil, nil).first

This will block if there's no data available but the file descriptor
  hasn't reached EOF.

That was my mistake: my code is based on a chat server example I found
on the web. GServer is not needed at all in order to use "select".

Note that GServer.start will spawn a new thread, which handles all the
  incoming connections. It will spawn a new thread for each new
  connection, so your "serve" method has to handle it and then exit, it
  doesn't make sense for it to keep lingering around after the client has
  gone away.

  You can also replace your main loop by server.join.

Great! I've used "server.join" instead of the loop and now the Ruby
process doesn't eat CPU :)

Thanks a lot!

···

2008/3/27, Marcelo <marcelo.magallon@gmail.com>:

--
Iñaki Baz Castillo
<ibc@aliax.net>