Blocking I/O is really easy to use. But when you use it to write
servers, you run into problems: you can't run two blocking syscalls
simultaneously. So if you're writing a huge file to some guy, every
other client is stalled, and no one new can connect. Unacceptable, for
many types of servers. They need non-blocking I/O.
Non-blocking I/O is a lot more annoying to use. Instead of going
write "Hello. What's your name?"
name = read_line
write "How do you do, #{name}?"
state = read_line
write "It's nice to hear that you're #{state}, #{name}."
we have to make weird state machines.
The good news: we can use callcc to make the non-blocking nature
practically invisible. Multiplexer does that. Here's how you'd write a
hello server:
class Test < Multiplexer::Handler
def handle
write_line "Hello. What's your name?"
name = read_line
write_line "How do you do, #{name}?"
state = read_line
write_line "It's nice to hear that you're #{state}, #{name}."
disconnect
end
end
def test
multiplexer = Multiplexer::Multiplexer.new 0.5
multiplexer.listen 31337, Test
multiplexer.run
end
test
It'll run in one thread. Multiplexer handles the select(3) calls.
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87vfbszr5m.fsf@igloo.phubuh.org...
Blocking I/O is really easy to use. But when you use it to write
servers, you run into problems: you can't run two blocking syscalls
simultaneously. So if you're writing a huge file to some guy, every
other client is stalled, and no one new can connect. Unacceptable, for
many types of servers. They need non-blocking I/O.
Non-blocking I/O is a lot more annoying to use. Instead of going
<snip/>
It'll run in one thread. Multiplexer handles the select(3) calls.
What is the advantage over a solution with threads? IOW, why should I use
multiplexer over individual threads per connection?
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87vfbszr5m.fsf@igloo.phubuh.org...
> Blocking I/O is really easy to use. But when you use it to write
> servers, you run into problems: you can't run two blocking syscalls
> simultaneously. So if you're writing a huge file to some guy, every
> other client is stalled, and no one new can connect. Unacceptable,
> for many types of servers. They need non-blocking I/O.
>
> Non-blocking I/O is a lot more annoying to use. Instead of going
<snip/>
> It'll run in one thread. Multiplexer handles the select(3) calls.
What is the advantage over a solution with threads? IOW, why should I
use multiplexer over individual threads per connection?
Since Ruby's threads aren't native, you can't do I/O from several at a
time. So one IO#read call blocking for a long time will block the other
threads, too. You could loop a read with a time-out, I guess. But with
a single thread running select, the whole process can stall completely
while waiting for I/O. And it's more elegant.
What is the advantage over a solution with threads? IOW, why should I
use multiplexer over individual threads per connection?
The issues raised in subsequent posts illustrate the classic arguments
between one-thread-per-connection proponents and select-loop proponents
that I have heard since the mid 1990's.
Given good thread and select/non-blocking-io implementations, both
methods can solve the same problems and can work just fine.
As for me, I believe that it's good to have both methods to choose from.
The java folks have recently offered a 'select' methodology in addition
to their traditional thread-based approach. Perl has not-so-recently
added thread support on top of its traditional preference for
'select'.
And I'm glad that I also have both options in ruby.
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87r7mgzo8x.fsf@igloo.phubuh.org...
"Robert Klemme" <bob.news@gmx.net> writes:
> "Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
> news:87vfbszr5m.fsf@igloo.phubuh.org...
>
> > Blocking I/O is really easy to use. But when you use it to write
> > servers, you run into problems: you can't run two blocking syscalls
> > simultaneously. So if you're writing a huge file to some guy, every
> > other client is stalled, and no one new can connect. Unacceptable,
> > for many types of servers. They need non-blocking I/O.
> >
> > Non-blocking I/O is a lot more annoying to use. Instead of going
>
> <snip/>
>
> > It'll run in one thread. Multiplexer handles the select(3) calls.
>
> What is the advantage over a solution with threads? IOW, why should I
> use multiplexer over individual threads per connection?
Since Ruby's threads aren't native, you can't do I/O from several at a
time.
That's not true.
So one IO#read call blocking for a long time will block the other
threads, too.
Also wrong: execute this script
tickers = [$stdout, $stderr].map do |io|
Thread.new do
100.times do |i|
io.puts "#{io.fileno}: #{Time.now}: Tick #{i}"
sleep 1
end
end
end
puts "PROMPT"
# blocks in next line:
input = gets
puts "ENTERED #{input}"
tickers.each {|th| th.join }
16:39:44 [ruby]: ruby ticker.rb
1: Fri Nov 26 17:40:05 GMT+2:00 2004: Tick 0
2: Fri Nov 26 17:40:05 GMT+2:00 2004: Tick 0
PROMPT
2: Fri Nov 26 17:40:06 GMT+2:00 2004: Tick 1
1: Fri Nov 26 17:40:06 GMT+2:00 2004: Tick 1
2: Fri Nov 26 17:40:07 GMT+2:00 2004: Tick 2
1: Fri Nov 26 17:40:07 GMT+2:00 2004: Tick 2
2: Fri Nov 26 17:40:08 GMT+2:00 2004: Tick 3
1: Fri Nov 26 17:40:08 GMT+2:00 2004: Tick 3
foo
ENTERED foo
2: Fri Nov 26 17:40:09 GMT+2:00 2004: Tick 4
1: Fri Nov 26 17:40:09 GMT+2:00 2004: Tick 4
2: Fri Nov 26 17:40:10 GMT+2:00 2004: Tick 5
1: Fri Nov 26 17:40:10 GMT+2:00 2004: Tick 5
2: Fri Nov 26 17:40:11 GMT+2:00 2004: Tick 6
1: Fri Nov 26 17:40:11 GMT+2:00 2004: Tick 6
2: Fri Nov 26 17:40:12 GMT+2:00 2004: Tick 7
1: Fri Nov 26 17:40:12 GMT+2:00 2004: Tick 7
You could loop a read with a time-out, I guess. But with
a single thread running select, the whole process can stall completely
while waiting for I/O. And it's more elegant.
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87vfbszr5m.fsf@igloo.phubuh.org...
> Blocking I/O is really easy to use. But when you use it to write
> servers, you run into problems: you can't run two blocking syscalls
> simultaneously. So if you're writing a huge file to some guy, every
> other client is stalled, and no one new can connect. Unacceptable,
> for many types of servers. They need non-blocking I/O.
>
> Non-blocking I/O is a lot more annoying to use. Instead of going
<snip/>
> It'll run in one thread. Multiplexer handles the select(3) calls.
What is the advantage over a solution with threads? IOW, why should I
use multiplexer over individual threads per connection?
Since Ruby's threads aren't native, you can't do I/O from several at a
time. So one IO#read call blocking for a long time will block the other
threads, too. You could loop a read with a time-out, I guess. But with
a single thread running select, the whole process can stall completely
while waiting for I/O. And it's more elegant.
But if I'm not mistaken, isn't callcc implemented with threads in Ruby?
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87r7mgzo8x.fsf@igloo.phubuh.org...
> "Robert Klemme" <bob.news@gmx.net> writes:
>
> > "Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
> > news:87vfbszr5m.fsf@igloo.phubuh.org...
> >
> > > Blocking I/O is really easy to use. But when you use it to
> > > write servers, you run into problems: you can't run two blocking
> > > syscalls simultaneously. So if you're writing a huge file to
> > > some guy, every other client is stalled, and no one new can
> > > connect. Unacceptable, for many types of servers. They need
> > > non-blocking I/O.
> > >
> > > Non-blocking I/O is a lot more annoying to use. Instead of
> > > going
> >
> > <snip/>
> >
> > > It'll run in one thread. Multiplexer handles the select(3) calls.
> >
> > What is the advantage over a solution with threads? IOW, why
> > should I use multiplexer over individual threads per connection?
>
> Since Ruby's threads aren't native, you can't do I/O from several at
> a time.
That's not true.
> So one IO#read call blocking for a long time will block the other
> threads, too.
Also wrong: execute this script
[snip]
Sorry: faulty generalization. The problem is demonstrated in this
script:
require 'socket'
server = TCPServer.new 12345
t_a = Thread.start do
a = server.accept
data = "foo" * 10000000
a << data
a.close
end
b = server.accept
b << "b"
b.close
t_a.join
The second client isn't accepted until the huge batch of data is sent.
I guess you could solve the problem by splitting it into a bunch of
smaller batches.
Another appeal could be that by keeping it single-threaded, you have
fewer concurrency issues to worry about.
In article <87r7mgzo8x.fsf@igloo.phubuh.org>,
>"Robert Klemme" <bob.news@gmx.net> writes:
>
>> "Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
>> news:87vfbszr5m.fsf@igloo.phubuh.org...
>>
>> > Blocking I/O is really easy to use. But when you use it to write
>> > servers, you run into problems: you can't run two blocking syscalls
>> > simultaneously. So if you're writing a huge file to some guy, every
>> > other client is stalled, and no one new can connect. Unacceptable,
>> > for many types of servers. They need non-blocking I/O.
>> >
>> > Non-blocking I/O is a lot more annoying to use. Instead of going
>>
>> <snip/>
>>
>> > It'll run in one thread. Multiplexer handles the select(3) calls.
>>
>> What is the advantage over a solution with threads? IOW, why should I
>> use multiplexer over individual threads per connection?
>
>Since Ruby's threads aren't native, you can't do I/O from several at a
>time. So one IO#read call blocking for a long time will block the other
>threads, too. You could loop a read with a time-out, I guess. But with
>a single thread running select, the whole process can stall completely
>while waiting for I/O. And it's more elegant.
>
But if I'm not mistaken, isn't callcc implemented with threads in Ruby?
It is, but I don't think that several ``continuation threads'' run
concurrently. In any case, there is definitely only one thread doing
the I/O.
"Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
news:87brdkzjd8.fsf@igloo.phubuh.org...
"Robert Klemme" <bob.news@gmx.net> writes:
> "Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
> news:87r7mgzo8x.fsf@igloo.phubuh.org...
> > "Robert Klemme" <bob.news@gmx.net> writes:
> >
> > > "Mikael Brockman" <mikael@phubuh.org> schrieb im Newsbeitrag
> > > news:87vfbszr5m.fsf@igloo.phubuh.org...
> > >
> > > > Blocking I/O is really easy to use. But when you use it to
> > > > write servers, you run into problems: you can't run two blocking
> > > > syscalls simultaneously. So if you're writing a huge file to
> > > > some guy, every other client is stalled, and no one new can
> > > > connect. Unacceptable, for many types of servers. They need
> > > > non-blocking I/O.
> > > >
> > > > Non-blocking I/O is a lot more annoying to use. Instead of
> > > > going
> > >
> > > <snip/>
> > >
> > > > It'll run in one thread. Multiplexer handles the select(3)
calls.
> > >
> > > What is the advantage over a solution with threads? IOW, why
> > > should I use multiplexer over individual threads per connection?
> >
> > Since Ruby's threads aren't native, you can't do I/O from several at
> > a time.
>
> That's not true.
>
> > So one IO#read call blocking for a long time will block the other
> > threads, too.
>
> Also wrong: execute this script
> [snip]
Sorry: faulty generalization. The problem is demonstrated in this
script:
> require 'socket'
>
> server = TCPServer.new 12345
>
> t_a = Thread.start do
> a = server.accept
> data = "foo" * 10000000
> a << data
> a.close
> end
>
> b = server.accept
> b << "b"
> b.close
>
> t_a.join
The second client isn't accepted until the huge batch of data is sent.
I guess you could solve the problem by splitting it into a bunch of
smaller batches.
Another appeal could be that by keeping it single-threaded, you have
fewer concurrency issues to worry about.
The typical pattern for TCPserver looks different: You create a thread per
accepted connection.
require 'socket'
server = TCPServer.new 12345
loop do
Thread.new(server.accept) do |a|
begin
data = "foo" * 10000000
a << data
ensure
a.close
end
end
end
> server = TCPServer.new 12345
>
> loop do
> Thread.new(server.accept) do |a|
> begin
> data = "foo" * 10000000
> a << data
> ensure
> a.close
> end
> end
> end
You're right, but I get the same results. Only one client at a time is
accepted.
Strange... I'm surprised there are too many differences between
your select multiplexer using continuations and a threads solution,
since: a) ruby uses select() to multiplex behind the scenes when
multiple threads are doing I/O; and b) ruby continuations are
implemented using threads.
That said I haven't studied your continuations solution so maybe
my surprise is misplaced...
But I wrote this simplistic threaded socket IO performance test
last month http://bwk.homeip.net/ftp/ruby/tcptest.rb and I've
watched it accept and handle 100+ clients without delay. Each
client is pushing bytes as fast as it can (at the specified
chunk size), and each server thread handling a client is in
return replying with bytes as fast as it can.
. . . Just chiming in in case this is somehow helpful... if not,
please disregard. =D
Until one thread terminates a system call like IO-related task, other
threads will be blocked. Kernel does not know ruby's (userland)
threads, so if your application needs concurrency in massive IO tasks,
maybe you should implement a kind of thread scheduler by yourself. OK,
Kernel#fork will be an another choice.
If 'thread' is only sufficient for all, why did Sun adopt NIO in Java2 1.4?
Run following server example, and execute 'telnet localhost 31337' on
two each shell, no client connection will be blocked.
<code>
require 'multiplexer'
class Test2 < Multiplexer::Handler
def initialize
super()
end
def handle
begin
write_line "ooo" * 10000000
rescue EOFError
puts "My client closed its socket."
end
end
end
def test
multiplexer = Multiplexer::Multiplexer.new 0.5
multiplexer.client_data = []
multiplexer.listen 31337, Test2
puts "Listening on port 31337."
multiplexer.run
end
if $0 == __FILE__
test
end
</code>
IMHO, multiplexinig is a good choice for concurrent programming in
ruby, insofar ruby VM does not support native threads.
In article <01ec01c4d3e3$68209820$6442a8c0@musicbox>,
"Bill Kelly" <billk@cts.com> writes:
Strange... I'm surprised there are too many differences between
your select multiplexer using continuations and a threads solution,
since: a) ruby uses select() to multiplex behind the scenes when
multiple threads are doing I/O; and b) ruby continuations are
implemented using threads.
Because write(2) system call blocks until whole data is written unless
nonblocking flag is set, even after select notify the fd is writable.
The multiplexer works well because it sets nonblocking flag.
The threaded server also works well if it sets nonblocking flag.
Try:
require 'socket'
require 'fcntl'
server = TCPServer.new 12345
t_a = Thread.start do
a = server.accept
a.fcntl(Fcntl::F_SETFL, Fcntl::O_NONBLOCK)
data = "foo" * 10000000
a << data
a.close
end
b = server.accept
b.fcntl(Fcntl::F_SETFL, Fcntl::O_NONBLOCK)
b << "b"
b.close
t_a.join
However nonblocking flag with stdio buffering may cause data lost...
Because write(2) system call blocks until whole data is written unless
nonblocking flag is set, even after select notify the fd is writable.
I've always used send/recv for my socket I/O. I *think* I don't
have a problem with blocking, even without NONBLOCK set. (Excepting
rare circumstances as are possible with accept(), etc.)
Does that sound right? Or should I be seeing blocking problems
even using send/recv ?
"Gyoung-Yoon Noh" <nohmad@gmail.com> schrieb im Newsbeitrag news:ff903fda04112610215fe2dc03@mail.gmail.com...
Until one thread terminates a system call like IO-related task, other
threads will be blocked. Kernel does not know ruby's (userland)
threads, so if your application needs concurrency in massive IO tasks,
maybe you should implement a kind of thread scheduler by yourself. OK,
Kernel#fork will be an another choice.
If 'thread' is only sufficient for all, why did Sun adopt NIO in Java2 1.4?
Because threads have a certain overhead and Sun wanted to provide a means to write high performance (i.e. highly optimized) servers. Apart from that I think they wanted to make the IO architecture more flexible and modular.
That said, a solution with threads and blocking IO is still more elegant IMHO. It's just that for some circumstances this is not the right solution.
In article <024601c4d40a$674fc8e0$6442a8c0@musicbox>,
"Bill Kelly" <billk@cts.com> writes:
I've always used send/recv for my socket I/O. I *think* I don't
have a problem with blocking, even without NONBLOCK set. (Excepting
rare circumstances as are possible with accept(), etc.)
Does that sound right? Or should I be seeing blocking problems
even using send/recv ?
You'll see blocking problem with send when you send huge data.
Linux man page of send(2) says:
When the message does not fit into the send buffer of the socket, send
normally blocks, unless the socket has been placed in non-blocking I/O
mode. In non-blocking mode it would return EAGAIN in this case. The
select(2) call may be used to determine when it is possible to send
more data.
Try:
require 'socket'
require 'fcntl'
server = TCPServer.new 12345
t_a = Thread.start do
a = server.accept
data = "foo" * 10000000
a.send data, 0
a.close
end
This problem is pretty easy to work around in threaded servers.
Realistically, it's usually one thread passing the data to send into a thread (or thread pool) that just does the writing, in most production servers I work with at least. The server I'm building right now uses thread safe queues to handle this communication. It's trivial to add a data length check, bust up the data, and queue multiple prints. Just remember to do that inside a synchronized block, to ensure that all messages are queued back-to-back.
James Edward Gray II
···
On Nov 26, 2004, at 7:56 PM, Tanaka Akira wrote:
You'll see blocking problem with send when you send huge data.
You'll see blocking problem with send when you send huge data.
Linux man page of send(2) says:
> When the message does not fit into the send buffer of the socket, send
> normally blocks, unless the socket has been placed in non-blocking I/O
> mode. In non-blocking mode it would return EAGAIN in this case. The
> select(2) call may be used to determine when it is possible to send
> more data.
Thank you for your patience. I've been wrong about that
for years.
Now I'm suddenly extremely interested in the non-blocking
extension for Win32 ...