TCP Sockets

It doesn’t need knowledge of the application protocol, but it needs to
know -all- about TCP/IP – it needs the headers for every packet –
source, destination, flags, window, mss, length.

Once you’re at the application (socket) level, you have buffers to deal
with – you don’t know whether the 99 bytes was 1, 2 or 3 packets at
all, unless you know how big the chunks the sender sends are.

···

On Fri, 2003-05-16 at 08:11, Dominik Werder wrote:

Sure, using the method that Nobu proposes you might be able to tell that
there are exactly 99 bytes sitting in the socket buffer right now.

But how do you know that 99 bytes is a whole request? Maybe the whole
request is 138 bytes, but because the message is sent in TCP chunks, the
first read() returned 99 bytes, and the next read() later returns 39 bytes.
Or maybe it’s two requests of 69 bytes each; the first read() returns the
whole first request plus 30 bytes of the second request, and the second
read() returns the remaining 39 bytes.
This is not the problem: if the request is not complete in one packet, the
rest comes with the following packet.
But if the request does not come at all because 99 bytes are not enough,
then I have a problem.
Think about my software as a TCP/IP router. The TCP/IP router doesn’t need
to have knowledge of the protocol either.
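
As a rough illustration of the 99 + 39 byte scenario above, here is a minimal Ruby sketch of a receive buffer. It assumes, purely for the example, that every request is exactly 69 bytes long; `RequestBuffer` and `REQUEST_SIZE` are hypothetical names, not part of any library.

```ruby
# Hypothetical fixed request size, taken from the 69-byte example above.
REQUEST_SIZE = 69

# Accumulates whatever read() happens to return and hands back only
# complete requests; leftover bytes wait for the next chunk.
class RequestBuffer
  def initialize
    @pending = String.new # mutable, binary accumulator
  end

  # Append one chunk and return an array of the requests it completed.
  def feed(chunk)
    @pending << chunk
    requests = []
    while @pending.bytesize >= REQUEST_SIZE
      requests << @pending.slice!(0, REQUEST_SIZE)
    end
    requests
  end

  def pending_bytes
    @pending.bytesize
  end
end

stream = ("A" * 69) + ("B" * 69)   # two requests, 138 bytes in total
buf    = RequestBuffer.new
first  = buf.feed(stream[0, 99])   # first read(): request 1 plus 30 bytes of request 2
second = buf.feed(stream[99, 39])  # second read(): the remaining 39 bytes
```

The receiver never learns how many packets carried the data; it only sees how many bytes each read() returned, so the boundary has to come from something the application knows (here, the assumed fixed size).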

Yes that’s what I meant - one thread per TCP connection or ‘logical peer’
(but you could have multiple peers on the same remote host)

Regards,

Brian.

···

On Fri, May 16, 2003 at 11:11:07PM +0900, Dominik Werder wrote:

They’ll have a separate TCP connection for each peer, and process messages
from each stream independently. Since you are parsing all the incoming
messages, you know where the message boundaries are.
Do they also have one thread per TCP connection?

Yes that’s what I meant - one thread per TCP connection or ‘logical peer’
(but you could have multiple peers on the same remote host)

Ok, then it can’t be that slow :)

Regards,
Dominik

It doesn’t need knowledge of the application protocol, but it needs to
know -all- about TCP/IP – it needs the headers for every packet –
source, destination, flags, window, mss, length.

You’re right, my example wasn’t that good.

I tell you what I wanna do:

I got two application servers with multiple threads each to handle
requests. A request on one server could need data from the second server.
But I don’t want to create a new connection each time. So I wanna have one
steady connection, and when one or more threads need data from the other
server, they can send their packets right through this one connection. Like
telecommunications networks (ATM?): the packets are mixed together, sent
through the line and extracted on the other end.

But here it is easier, because I don’t need to do the routing: the
connection is already there. Because of this the router example was
terribly bad :(

Once you’re at the application (socket) level, you have buffers to deal
with – you don’t know whether the 99 bytes was 1, 2 or 3 packets at
all, unless you know how big the chunks the sender sends are.

The module I’m trying to write should not need any information about the
protocols which are used on the line. It should only mix packets together,
send them, and on the other side the same module extracts the packets
again :)

bye!
Dominik
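
A minimal Ruby sketch of such a protocol-agnostic mix/extract module might look like the following. The frame layout (a 2-byte channel id plus a 4-byte length in front of an opaque payload) and the names `mux_write` and `mux_read` are assumptions made for illustration; a `StringIO` stands in for the single steady connection.

```ruby
require "stringio"

# Hypothetical frame layout: 2-byte channel id, 4-byte payload length
# (both big-endian), then the payload, which is never inspected.
HEADER_FORMAT = "nN"
HEADER_SIZE   = 6

def mux_write(io, channel, payload)
  io.write([channel, payload.bytesize].pack(HEADER_FORMAT) + payload.b)
end

# Returns [channel, payload], or nil at a clean end of stream.
def mux_read(io)
  header = io.read(HEADER_SIZE)
  return nil if header.nil?
  raise EOFError, "truncated header" if header.bytesize < HEADER_SIZE
  channel, length = header.unpack(HEADER_FORMAT)
  payload = io.read(length)
  raise EOFError, "truncated payload" if payload.nil? || payload.bytesize < length
  [channel, payload]
end

# Sending side: frames from two logical channels are mixed on one line.
line = StringIO.new(String.new)
mux_write(line, 1, "GET user 42")
mux_write(line, 2, "GET order 7")
mux_write(line, 1, "GET user 43")
line.rewind

# Receiving side: extract the frames and sort them back by channel.
by_channel = Hash.new { |hash, key| hash[key] = [] }
while (frame = mux_read(line))
  channel, payload = frame
  by_channel[channel] << payload
end
```

The module itself knows nothing about what the payload bytes mean, but it still has to add its own small header so the far end can split the stream again.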

So you’re basically implementing datagram semantics (in order to
multiplex the stream-oriented protocols) over a connection-oriented
link. Did I get it right?

But at some point you need the packet length in order to demux the stream,
don’t you? You can either use constant-sized packets or specify the length
in some sort of header. The latter is what drb does (first sends datagram
size, then info), the former is employed by ATM (48 bytes payload +
5 header).
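
The length-header variant can be sketched in Ruby roughly as follows. The 4-byte big-endian header and the helper names `write_frame` and `read_frame` are assumptions of this example, and a `StringIO` stands in for the socket so it runs without a network.

```ruby
require "stringio"

# Write one message as a 4-byte big-endian length followed by the payload.
def write_frame(io, payload)
  io.write([payload.bytesize].pack("N"))
  io.write(payload)
end

# Read exactly one message, or return nil at a clean end of stream.
# On a blocking socket, read(n) waits until n bytes have arrived (or EOF),
# so it does not matter how TCP happened to split the data.
def read_frame(io)
  header = io.read(4)
  return nil if header.nil?
  raise EOFError, "truncated header" if header.bytesize < 4
  length  = header.unpack1("N")
  payload = io.read(length)
  raise EOFError, "truncated payload" if payload.nil? || payload.bytesize < length
  payload
end

wire = StringIO.new(String.new)
write_frame(wire, "first request")
write_frame(wire, "second, longer request")
wire.rewind

frames = []
while (frame = read_frame(wire))
  frames << frame
end
```

With constant-sized packets the header disappears and the receiver simply reads a fixed number of bytes each time, at the cost of padding short messages and splitting long ones.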

···

On Sat, May 17, 2003 at 10:24:34PM +0900, Dominik Werder wrote:


Running Debian GNU/Linux Sid (unstable)
batsman dot geo at yahoo dot com

One tree to rule them all,
One tree to find them,
One tree to bring them all,
and to itself bind them.
– Gavin Koch gavin@cygnus.com

In that scenario, the easiest thing to do is open a connection to the remote
server, run a separate thread for each incoming connection, then whenever
one of those threads wants to communicate with the remote server, it “grabs”
the outgoing connection, uses it, and releases it. “grabs” in this case
means takes a Mutex, so that none of the other threads can use it until it
has finished.
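
A minimal Ruby sketch of that "grab, use, release" pattern is shown below. `FakeRemote` is a hypothetical stand-in for the outgoing connection (a real program would presumably hold a `TCPSocket` there), and `SharedConnection` is likewise an illustrative name.

```ruby
# Hypothetical stand-in for the single connection to the remote server,
# so the sketch runs without a network.
class FakeRemote
  def send_request(text)
    @last = text
  end

  def read_response
    sleep 0.001 # leave room for another thread to interfere, if it could
    "reply to #{@last}"
  end
end

class SharedConnection
  def initialize(remote)
    @remote = remote
    @lock   = Mutex.new
  end

  # Hold the Mutex for the whole request/response exchange so that no
  # other thread can interleave its own traffic on the connection.
  def call(request)
    @lock.synchronize do
      @remote.send_request(request)
      @remote.read_response
    end
  end
end

conn = SharedConnection.new(FakeRemote.new)

# Five worker threads share the one outgoing connection.
threads = (1..5).map do |i|
  Thread.new { [i, conn.call("request #{i}")] }
end
replies = threads.map(&:value).to_h
```

Because each thread keeps the lock until its response has arrived, requests are serialized; that is the price of not needing any framing or channel ids on the shared link.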

You could also write your application as a single thread: in that case you
need to select() on all the incoming fds, receive any messages or
part-messages which are available, append the part-messages to buffers for
each incoming stream, and when one of these buffers has a full message then
you process it as necessary. If you need to maintain separate state for each
connection then you’ll need an explicit array of objects that hold state.
It’s a lot lower-level and a lot harder work.
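
For comparison, a single-threaded loop along those lines could be sketched in Ruby like this. Two pipes stand in for the incoming sockets, and messages are assumed to be newline-terminated; both are choices made only to keep the example small and runnable.

```ruby
# Two pipes stand in for two incoming sockets (an assumption so the
# sketch runs without a network).
r1, w1 = IO.pipe
r2, w2 = IO.pipe

buffers  = { r1 => String.new, r2 => String.new } # part-message buffer per stream
names    = { r1 => :a, r2 => :b }
complete = []

w1.write("hel")          # a part-message on stream a
w2.write("one\ntwo\n")   # two full messages on stream b
w1.write("lo\nwor")      # rest of the first message plus part of the next
w1.write("ld\n")
w1.close
w2.close

until buffers.empty?
  ready, = IO.select(buffers.keys)
  ready.each do |io|
    begin
      buffers[io] << io.read_nonblock(4096)
    rescue IO::WaitReadable
      next                 # nothing to read after all, try again later
    rescue EOFError
      buffers.delete(io)   # peer closed the stream
      io.close
      next
    end
    # Pull out every complete message; any remainder stays buffered.
    while (idx = buffers[io].index("\n"))
      complete << [names[io], buffers[io].slice!(0..idx).chomp]
    end
  end
end
```

All the bookkeeping (which buffer belongs to which stream, what is still incomplete) lives in explicit hashes here, which is the extra work the paragraph above refers to.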

On the other hand, if you write your incoming-connection handler as a
thread, then it can be written in a simple linear style - accept request, do
something, send response - without having to keep an explicit array of
connections, array of buffers, array of state variables etc.
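
A thread-per-connection handler in that linear style might look like the Ruby sketch below. The line-based echo protocol, the `handle` method and the loopback address are assumptions of the example, not something specified in this thread.

```ruby
require "socket"

# Handler in plain linear style: read a request, do something, send the
# response. Per-connection state simply lives in local variables.
def handle(client)
  while (request = client.gets)
    client.puts("echo: #{request.chomp}")
  end
ensure
  client.close
end

server = TCPServer.new("127.0.0.1", 0) # port 0: let the OS pick a free port
port   = server.addr[1]

acceptor = Thread.new do
  loop do
    client = server.accept
    Thread.new(client) { |c| handle(c) } # one thread per connection
  end
end

# Three clients talk to the server at the same time.
replies = (1..3).map do |i|
  Thread.new do
    sock = TCPSocket.new("127.0.0.1", port)
    sock.puts("request #{i}")
    answer = sock.gets.chomp
    sock.close
    answer
  end
end.map(&:value)

acceptor.kill
acceptor.join
server.close
```

There is no table of connections or buffers anywhere: each thread's stack holds the state for its own peer.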

Cheers,

Brian.

···

On Sat, May 17, 2003 at 10:24:34PM +0900, Dominik Werder wrote:

So you’re basically implementing datagram semantics (in order to
multiplex the stream-oriented protocols) over a connection-oriented
link. Did I get it right?

Yes exactly, that’s what I wanted to say!

But at some point you need the packet length in order to demux the stream,
don’t you? You can either use constant-sized packets or specify the length
in some sort of header. The latter is what drb does (first sends datagram
size, then info), the former is employed by ATM (48 bytes payload +
5 header).

I had chosen the drb style, I think it’s more convenient.

thanks!
Dominik

On the other hand, if you write your incoming-connection handler as a
thread, then it can be written in a simple linear style - accept request, do
something, send response - without having to keep an explicit array of
connections, array of buffers, array of state variables etc.

This is right, and it’s probably the way to go I think :)

thanks!
Dominik