Question on networking with custom binary interface

I am working on a Ruby server application for Windows that needs to
communicate with a client over standard TCP sockets, but the data being
passed is kind of weird. The client is a C++ program I cannot modify.
However, I have the documentation for its binary interface, so if I can
make all the data I pass match the documentation, it should work.

My main concern is that the documentation is VERY specific. As in, the
first four bytes have to be an unsigned long, the next two bytes have to
be an unsigned short, and the next is an unsigned char. Then some places
in the interface allow for a string of length N, and the only guarantee
is that the string will be null terminated to mark its end. And then
there are some odd data types, like a four-byte data type only referred
to as "DWORD" and an eight-byte data type called
"FILETIME.dwLowDateTime(DWORD)".

Any advice for doing this type of binary IO?

--
Posted via http://www.ruby-forum.com/.

Greg Chambers wrote:

Any advice for doing this type of binary IO?

Array#pack, String#unpack are your friends. See also

http://redshift.sourceforge.net/bit-struct/
http://rubyforge.org/projects/binaryparse/
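
To make the suggestion concrete, here is a minimal sketch of packing and unpacking the layout described in the question (a four-byte unsigned long, a two-byte unsigned short, a one-byte unsigned char, then a null-terminated string) with Array#pack and String#unpack. The field values and the choice of network (big-endian) byte order are assumptions for illustration; the interface documentation decides the real order.

```ruby
# Directives: N = 32-bit unsigned big-endian, n = 16-bit unsigned big-endian,
# C = 8-bit unsigned, Z* = string with a trailing null appended on pack.
HEADER_FORMAT = "NnCZ*"

# Hypothetical field values, not taken from the real interface.
packet = [1024, 7, 3, "hello"].pack(HEADER_FORMAT)

# Unpacking with the same format reverses it; Z* reads up to the first null.
length, id, flag, text = packet.unpack(HEADER_FORMAT)

p packet.bytesize            # 4 + 2 + 1 + 5 + 1 null byte
p [length, id, flag, text]
```

If the C++ side expects little-endian values, "V" and "v" would replace "N" and "n".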

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

I don't have anything to add to Joel's excellent reply, but one thing caught my eye:

My main concern is that the documentation is VERY specific.

That sounds odd to me. Most of the time people complain that there is no documentation or that it's inaccurate. Apparently you got documentation leaving no questions and are concerned. This is the best that could happen to you in this situation. Why are you concerned?

Kind regards

  robert

PS: One additional heads up: when encoding and decoding numbers pay special attention to byte ordering (big endian, little endian).
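
A short sketch of what the byte-ordering caveat means in practice, using an arbitrary example value. Which order the client expects is an assumption to check against the interface documentation: a C++ program on x86 Windows that writes its structs directly would produce little-endian data, but a protocol may just as well specify network order.

```ruby
value = 0x12345678

big    = [value].pack("N")   # network / big-endian: most significant byte first
little = [value].pack("V")   # little-endian: least significant byte first

p big.unpack("H*").first      # => "12345678"
p little.unpack("H*").first   # => "78563412"

# Reading with the wrong order silently yields a different number.
p big.unpack("V").first == value   # => false
```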

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Robert Klemme wrote:

I don't have anything to add to Joel's excellent reply, but one thing
caught my eye:

My main concern is that the documentation is VERY specific.

That sounds odd to me. Most of the time people complain that there is
no documentation or that it's inaccurate. Apparently you got
documentation leaving no questions and are concerned. This is the best
that could happen to you in this situation. Why are you concerned?

Kind regards

  robert

PS: One additional heads up: when encoding and decoding numbers pay
special attention to byte ordering (big endian, little endian).

Well my concern was since this was specific documentation intended for
C++ programs, I was worried about how Ruby would attempt to handle this
under the hood. I mean, I would not even worry if the client program
was also a Ruby program because I could be sure that one Ruby program
would know what another Ruby program is saying, but I couldn't be sure
that some proprietary C++ program would know what a Ruby program was
trying to say. Mostly I was worried about compatibility of data between
programs. That is why I was worried about how specific the
documentation was. Otherwise it is greatly appreciated.

Also, for a quick switch in topic, thank you guys for your help. You
are really helping an intern out.

Joel VanderWerf wrote:

Greg Chambers wrote:

Any advice for doing this type of binary IO?

Array#pack, String#unpack are your friends.

I am looking at the documentation of these methods and this covers a LOT
of what I need to do, but I am worried in a couple areas. For example,
part of the binary string I will be getting will contain an eight byte
unsigned long long. How should I approach custom byte structures like
this?

Robert Klemme wrote:

I don't have anything to add to Joel's excellent reply, but one thing
caught my eye:

My main concern is that the documentation is VERY specific.

That sounds odd to me. Most of the time people complain that there is
no documentation or that it's inaccurate. Apparently you got
documentation leaving no questions and are concerned. This is the best
that could happen to you in this situation. Why are you concerned?

Well my concern was since this was specific documentation intended for
C++ programs, I was worried about how Ruby would attempt to handle this
under the hood. I mean, I would not even worry if the client program
was also a Ruby program because I could be sure that one Ruby program
would know what another Ruby program is saying, but I couldn't be sure
that some proprietary C++ program would know what a Ruby program was
trying to say. Mostly I was worried about compatibility of data between
programs. That is why I was worried about how specific the
documentation was. Otherwise it is greatly appreciated.

Well, over the network it's just bytes, and I haven't heard yet that
Ruby bytes are any different from C++ bytes. :-) The only tricky part
is to get the ordering of bytes right. See also:

Also, for a quick switch in topic, thank you guys for your help. You
are really helping an intern out.

You're welcome!

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Greg Chambers wrote:

Joel VanderWerf wrote:

Greg Chambers wrote:

Any advice for doing this type of binary IO?

Array#pack, String#unpack are your friends.

I am looking at the documentation of these methods and this covers a LOT of what I need to do, but I am worried in a couple areas. For example, part of the binary string I will be getting will contain an eight byte unsigned long long. How should I approach custom byte structures like this?

Ok, Array#pack, String#unpack are your _relatives_. BitStruct is your _friend_.

require 'bit-struct'

class MyPacket < BitStruct
   unsigned :x, 8*8, "The x field", :endian => :network
     # :network is the default, and it's the same as :big
   unsigned :y, 8*8, "The y field", :endian => :little
end

pkt = MyPacket.new
pkt.x = 59843759843759843
pkt.y = 59843759843759843

p pkt.x # 59843759843759843
p pkt.y # 59843759843759843

p pkt.inspect
# "#<MyPacket x=59843759843759843, y=59843759843759843>"
p pkt.to_s.inspect
# "\"\\000\\324\\233\\225\\037\\202\\326\\343\\343\\326\\202\\037\\225\\233\\324\\000\""

puts pkt.inspect_detailed
# MyPacket:
# The x field = 59843759843759843
# The y field = 59843759843759843

puts MyPacket.describe
#     byte: type         name          [size] description
# ----------------------------------------------------------------------
#       @0: unsigned     x             [  8B] The x field
#       @8: unsigned     y             [  8B] The y field

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Oops, there was an extra layer of inspection in two of the output lines. Fixed:

require 'bit-struct'

class MyPacket < BitStruct
   unsigned :x, 8*8, "The x field", :endian => :network
     # :network is the default, and it's the same as :big
   unsigned :y, 8*8, "The y field", :endian => :little
end

pkt = MyPacket.new
pkt.x = 59843759843759843
pkt.y = 59843759843759843

p pkt.x # 59843759843759843
p pkt.y # 59843759843759843

p pkt
# #<MyPacket x=59843759843759843, y=59843759843759843>
p pkt.to_s
# "\000\324\233\225\037\202\326\343\343\326\202\037\225\233\324\000"

puts pkt.inspect_detailed
# MyPacket:
# The x field = 59843759843759843
# The y field = 59843759843759843

puts MyPacket.describe
#     byte: type         name          [size] description
# ----------------------------------------------------------------------
#       @0: unsigned     x             [  8B] The x field
#       @8: unsigned     y             [  8B] The y field
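
For comparison, the same eight-byte fields can be sketched with plain pack. The "Q>" and "Q<" forms assume Ruby 1.9.3 or later; on older interpreters the value can be split into two 32-bit halves, which is also how a Windows FILETIME is laid out (dwLowDateTime followed by dwHighDateTime). Whether the wire format stores the low half first is an assumption to check against the interface documentation.

```ruby
n = 59843759843759843

big    = [n].pack("Q>")   # 64-bit unsigned, big-endian (same bytes as field x)
little = [n].pack("Q<")   # 64-bit unsigned, little-endian (same bytes as field y)

p big.unpack("H*").first     # => "00d49b951f82d6e3"
p little.unpack("H*").first  # => "e3d6821f959bd400"

# Portable alternative: two 32-bit little-endian halves, low half first.
low  = n & 0xFFFFFFFF
high = n >> 32
halves = [low, high].pack("VV")
p halves == little           # => true

# Decoding the halves back into one integer.
lo, hi = halves.unpack("VV")
decoded = (hi << 32) | lo
p decoded == n               # => true
```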

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Sorry to bump an old topic of mine, but I got sidetracked with work from
a different project and had one more question.

Joel VanderWerf wrote:

insert example code here

I've been using BitStruct and it works like a dream except in one area,
which is strings. My first test class keeps it easy by putting the
string at the end of the packet so I can just use a rest field. Although
the output returned by describe looks fine, when I convert to a String
for sending, the string is appended as is instead of being converted
like all the other data. Here is the test code I wrote:

# test_ascii_message.rb
require 'bit-struct'

class TestAsciiMessage < BitStruct
  unsigned :message_length, 4*8, "Message Length", :endian => :network
  unsigned :message_id, 2*8, "Unique Message ID", :endian => :network
  unsigned :service_id, 1*8, "Service Identifier", :endian => :network
  unsigned :event_id, 1*8, "Event Identifier", :endian => :network
  rest :text_msg, "Null terminated ASCII Message String"
end

# main.rb
require 'test_ascii_message'

test_packet = TestAsciiMessage.new
test_packet.message_id = 0
test_packet.service_id = 2
test_packet.event_id = 3
test_packet.text_msg = "Testing 01 10 11"
test_packet.message_length = test_packet.length
puts "-"*75
puts test_packet.inspect
puts "-"*75
puts test_packet.inspect_detailed
puts "-"*75
puts TestAsciiMessage.describe
puts "-"*75
p test_packet.to_s

And the resulting output was this:

---------------------------------------------------------------------------
#<TestAsciiMessage message_length=24, message_id=0, service_id=2,
event_id=3, text_msg="Testing 01 10 11">
---------------------------------------------------------------------------
TestAsciiMessage:
                Message Length = 24
             Unique Message ID = 0
            Service Identifier = 2
              Event Identifier = 3
Null terminated ASCII Message String = "Testing 01 10 11"
---------------------------------------------------------------------------
    byte: type         name          [size] description
----------------------------------------------------------------------
      @0: unsigned     message_length[ 32b] Message Length
      @4: unsigned     message_id    [ 16b] Unique Message ID
      @6: unsigned     service_id    [  8b] Service Identifier
      @7: unsigned     event_id      [  8b] Event Identifier
---------------------------------------------------------------------------
"\000\000\000\030\000\000\002\003Testing 01 10 11"

This output is fine until it gets to the string. I just worry, like I
did above, because the program receiving this needs to see the string as
a null terminated *char string... did I say that right? Don't know
enough C for this.

Anyways, that was my first question. The second question is how would I
go about converting a binary string I received into a BitStruct when
there is a variable length, null terminated string in the center of the
binary string? Figured I would ask since using a rest field won't work
here.

Greg Chambers wrote:

I've been using BitStruct and it works like a dream except in one area, which is with strings. My first test class I am making keeps it easy by putting it at the end of the packet so I can just use a rest field. And although the output returned by description looks fine, when I convert to a String for sending, the string is added on as is instead of converted like all the other data. Here was the test code I wrote:

...

"\000\000\000\030\000\000\002\003Testing 01 10 11"

This output is fine until it gets to the string. I just worry, like I did above, because the program receiving this needs to see the string as a null terminated *char string... did I say that right? Don't know enough C for this.

I see your point. You can of course add the null manually:

test_packet.text_msg = "Testing 01 10 11\0"

but that's kinda defeating the purpose of bit-struct.

Another thing you can do is define reader/writer methods that strip/append the null. Rename the field to _text_msg and define:

   def text_msg; _text_msg[/(.*)\0\z/, 1]; end
   def text_msg=(s); self._text_msg = "#{s}\0"; end

It would be easy to add this to bit-struct, I'm just not sure what the api should be. Maybe an option like

   rest :text_str, :terminator => "\0"

Thoughts?
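
The reader/writer idea can be tried without bit-struct. In this sketch a plain class with a raw _text_msg attribute stands in for the BitStruct subclass (the class name RawMessage is hypothetical); the two accessor methods are the ones suggested above.

```ruby
# Hypothetical stand-in for a BitStruct subclass whose rest field
# has been renamed to _text_msg.
class RawMessage
  attr_accessor :_text_msg   # bytes exactly as they go on the wire

  def initialize
    @_text_msg = "\0"        # an empty C string is a single null byte
  end

  # Reader strips the trailing null.
  def text_msg; _text_msg[/(.*)\0\z/, 1]; end

  # Writer appends the null.
  def text_msg=(s); self._text_msg = "#{s}\0"; end
end

msg = RawMessage.new
msg.text_msg = "Testing 01 10 11"
p msg._text_msg   # raw form ends in a null byte
p msg.text_msg    # caller sees the plain Ruby string
```

The caller never handles the terminator, while the bytes sent to the C++ side always end in one.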

Anyways, that was my first question. The second question is how would I go about converting a binary string I received into a BitStruct when there is a variable length, null terminated string in the center of the binary string? Figured I would ask since using a rest field won't work here.

Is the field fixed length, even if the string itself is variable length? If so, you can use a #text field, which pads with null chars:

require 'bit-struct'

class Msg < BitStruct
   unsigned :x, 16
   text :str, 10*8
   unsigned :y, 16
end

m = Msg.new(:x => 1, :str => "foo", :y => 2)
p m
p m.to_s

__END__

#<Msg x=1, str="foo", y=2>
"\000\001foo\000\000\000\000\000\000\000\000\002"
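
The same fixed-width layout can be sketched with pack, for comparison. Here "a10" pads the string with nulls to ten bytes on the way out, and "Z10" drops everything from the first null on the way in; big-endian byte order is assumed, to match the BitStruct default.

```ruby
FIXED_FORMAT = "na10n"   # 16-bit x, 10-byte null-padded string, 16-bit y

raw = [1, "foo", 2].pack(FIXED_FORMAT)
p raw.bytesize           # 2 + 10 + 2

# Z10 reads ten bytes and ends the string at the first null.
x, str, y = raw.unpack("nZ10n")
p [x, str, y]
```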

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Thanks for the help with the first question. As for the second question,
no, the field is not fixed length. The only guarantees are that the
string is the only variable length field, it is null terminated, its
start byte is known, and the length of the entire packet is known.
Any ideas?

Greg Chambers wrote:

Thanks for the help with the first question. As for the second question,
no, the field is not fixed length. The only guarantees are that the
string is the only variable length field, it is null terminated, its
start byte is known, and the length of the entire packet is known.
Any ideas?

Was afraid of that... it's kind of a tricky case. The only way to read a field after the variable length field is to scan to the null terminator and then jump from that to a known offset, right? And it gets worse with each such field. So the accessor-based approach of bit-struct becomes complicated and inefficient. You might be better off pre-parsing the data into blocks of fixed and variable length fields. Then use BitStruct classes to parse the former. Wrap your own class (even just an array with positional accessors) around this and it could be fairly nice.

Ara Howard had a suggestion along those lines, using pack/unpack instead of bit-struct, in:

http://blade.nagaokaut.ac.jp/cgi-bin/scat.rb/ruby/ruby-talk/152096

That example wasn't for variable length fields, but it would apply. The key point is that instead of leaving the data as a string and accessing fields as substrings (like bit-struct does), you parse it into an array, access the entries in the array, and write it back to a string when needed.

Maybe a hybrid would work for you... parse into an array of BitStructs and variable length Strings.
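
A minimal sketch of that pre-parsing idea, using pack/unpack for the fixed blocks. The layout (an eight-byte header, one null-terminated string, a four-byte trailer), the names HEADER_SIZE and split_packet, the sample values, and the big-endian byte order are all assumptions for illustration.

```ruby
HEADER_SIZE = 8   # hypothetical: message_length, message_id, service_id, event_id

# Split a packet into [fixed header, string without its null, fixed trailer].
# The search starts after the header, since the header itself may contain nulls.
def split_packet(data)
  nul = data.index("\0", HEADER_SIZE)
  raise ArgumentError, "no null terminator found" unless nul
  [data[0, HEADER_SIZE], data[HEADER_SIZE...nul], data[(nul + 1)..-1]]
end

# Build a sample packet; Z* appends the terminating null.
sample = [20, 9, 2, 3, "Testing", 0xCAFE].pack("NnCCZ*N")

head, text, tail = split_packet(sample)
fields = head.unpack("NnCC") + [text] + tail.unpack("N")
p fields

# unpack can also do it in one pass: Z* consumes through the null,
# so the directives after it resume at the next byte.
p sample.unpack("NnCCZ*N") == fields
```

The head and tail strings could equally be handed to BitStruct classes instead of unpack, which would give the hybrid described above.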

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407