Is resolv.rb broken in 1.9.3 due to encodings?

Hi,

I'm trying to work through some issues using 1.9.3 for my project RubyDNS,
and I noticed the following code in resolv.rb's MessageEncoder:

      class MessageEncoder # :nodoc:
        def initialize
          @data = ''
          @names = {}
          yield self
        end

        def to_s
          return @data
        end

        def put_bytes(d)
          @data << d
        end

        def put_pack(template, *d)
          @data << d.pack(template)
        end

        def put_length16
          length_index = @data.length
          @data << "\0\0"
          data_start = @data.length
          yield
          data_end = @data.length
          @data[length_index, 2] = [data_end - data_start].pack("n")
        end

I'm just wondering, shouldn't put_length16 be using byteslice, bytesize and
friends instead of @data.length and @data[length_index, 2]? Isn't this code
completely broken by default, since we use UTF-8 encoding? Am I missing
something?
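To make the concern concrete, here is a minimal sketch of how length and bytesize differ. The sample string is my own and not taken from resolv.rb. It also shows that the two agree once the same bytes are tagged as binary, which may be why the encoder works in practice if @data ends up as ASCII-8BIT.

```ruby
# Illustrative sample only: three Japanese characters, written with \u
# escapes so the literal is UTF-8 regardless of the source file encoding.
text = "\u65E5\u672C\u8A9E"

char_count = text.length      # counts characters
byte_count = text.bytesize    # counts bytes

# The same bytes tagged as binary: every byte counts as one "character",
# so length and bytesize agree and index-based slicing works on bytes.
binary = text.dup.force_encoding(Encoding::BINARY)
binary_count = binary.length

puts "length=#{char_count} bytesize=#{byte_count} binary length=#{binary_count}"
# prints "length=3 bytesize=9 binary length=9"
```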

Kind regards,
Samuel

Okay, so I did a bit more testing and found that I do at least get an
error. When I try to encode an IN::TXT record that includes UTF-8 Japanese
characters, I get the following:

Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8

I can't see why this is happening though because the initial @data = '' –
shouldn't that use the default encoding (UTF-8)?
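One possible explanation, sketched here outside of resolv.rb and with made-up values: appending the result of Array#pack, which is ASCII-8BIT for a template like "n", to an empty string switches the buffer to ASCII-8BIT as soon as the packed bytes include anything above 0x7F. A later append of non-ASCII UTF-8 text then has no compatible encoding. Note also that on 1.9.3 a bare '' takes the source file's encoding, which is US-ASCII unless the file has a magic comment.

```ruby
# Stand-in for @data in MessageEncoder#initialize (dup keeps it mutable).
data = ''.dup

# 0xABCD is an arbitrary value chosen so both packed bytes are above 0x7F.
data << [0xABCD].pack('n')
encoding_after_pack = data.encoding   # the buffer is now ASCII-8BIT

# Appending non-ASCII UTF-8 text to a binary buffer that already holds
# high bytes is rejected.
error = begin
  data << "\u65E5\u672C\u8A9E"
  nil
rescue Encoding::CompatibilityError => e
  e
end

puts encoding_after_pack
puts error.class
```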

Finally, I am happy to pack UTF-8 data into the IN::TXT, but it doesn't
seem that this is possible using IN::TXT as it currently stands – is there
some way to consider the UTF-8 string as ASCII-8BIT for binary data
purposes?
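For what it's worth, one way this might be done is to retag a copy of the string with force_encoding, which changes only the encoding label and leaves the bytes untouched. This is a sketch with a made-up buffer rather than a real IN::TXT record, so whether resolv.rb accepts the result end to end is an assumption I have not verified here.

```ruby
text = "\u65E5\u672C\u8A9E"   # illustrative UTF-8 payload

# Retag a copy as binary; the bytes are unchanged, only the label differs.
binary = text.dup.force_encoding(Encoding::BINARY)

# Stand-in for the encoder's buffer, already holding binary data.
data = ''.dup
data << [0xABCD].pack('n')
data << binary                # no CompatibilityError: both sides are binary

# The receiving side can retag the bytes as UTF-8 to get the text back.
round_trip = binary.dup.force_encoding(Encoding::UTF_8)

puts data.bytesize
puts round_trip == text
```

The dup matters: force_encoding modifies its receiver, so without it the caller's original string would be retagged as well.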

Kind regards,
Samuel

On 12 June 2012 04:28, Samuel Williams <space.ship.traveller@gmail.com> wrote:
