Hi,
I'm working through some issues with Ruby 1.9.3 in my project RubyDNS,
and I noticed the following code in resolv.rb's MessageEncoder:

    class MessageEncoder # :nodoc:
      def initialize
        @data = ''
        @names = {}
        yield self
      end

      def to_s
        return @data
      end

      def put_bytes(d)
        @data << d
      end

      def put_pack(template, *d)
        @data << d.pack(template)
      end

      def put_length16
        length_index = @data.length
        @data << "\0\0"
        data_start = @data.length
        yield
        data_end = @data.length
        @data[length_index, 2] = [data_end - data_start].pack("n")
      end

I'm just wondering, shouldn't this be using byteslice, bytesize and
friends? Isn't this code completely broken by default since we use UTF-8
encoding? Am I missing something?
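
To illustrate the concern, here is a minimal sketch in plain Ruby. The helper name char_and_byte_counts is made up for this example and is not part of resolv.rb. It only shows that length counts characters for a UTF-8 string, while the same bytes tagged as binary are counted one per byte:

```ruby
# Hypothetical helper (not part of resolv.rb): report both measures of a string.
def char_and_byte_counts(str)
  [str.length, str.bytesize]
end

# Three Japanese characters written as \u escapes, so the literal is UTF-8
# regardless of the source file's encoding.
utf8 = "\u65E5\u672C\u8A9E"

# Same bytes, relabelled as binary (ASCII-8BIT); dup leaves the original untouched.
binary = utf8.dup.force_encoding(Encoding::BINARY)

p char_and_byte_counts(utf8)    # character count and byte count differ
p char_and_byte_counts(binary)  # both numbers are byte counts
```

If @data is always binary, length and bytesize would agree and put_length16 would compute the right offsets; whether that assumption actually holds inside resolv.rb is exactly what I'm unsure about.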
Kind regards,
Samuel
Okay, so I just did a bit more testing and found that at least I get an
error. When I try to encode an IN::TXT record that includes UTF-8 Japanese
characters, I get the following error:

    Encoding::CompatibilityError: incompatible character encodings: ASCII-8BIT and UTF-8

I can't see why this is happening, though, because the buffer starts out as
@data = ''. Shouldn't that use the default encoding (UTF-8)?
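
The same error can be reproduced outside resolv.rb. This is a minimal sketch, assuming the buffer has already received a non-ASCII byte from Array#pack before the UTF-8 text is appended. The helper try_append and the value 0xC00C are made up for illustration, and I have not confirmed that this is the exact path taken inside MessageEncoder:

```ruby
# Hypothetical helper: append to the buffer and report what happened.
def try_append(buffer, str)
  buffer << str
  :ok
rescue Encoding::CompatibilityError => e
  e.class
end

# Build a buffer the way an encoder might: start empty, then append packed data.
def packed_buffer
  buffer = "".dup                # empty string in the source's default encoding
  buffer << [0xC00C].pack("n")   # pack returns an ASCII-8BIT string containing byte 0xC0
  buffer
end

buffer = packed_buffer
p buffer.encoding                            # the empty buffer has taken on the packed string's encoding
p try_append(buffer, "\u65E5\u672C\u8A9E")   # UTF-8 text with non-ASCII characters
```

An empty or ASCII-only string takes on the encoding of non-ASCII data appended to it. So the initial '' does not stay in the default encoding once binary data with a high byte has been added, and a later UTF-8 append then conflicts with it.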
Finally, I am happy to pack UTF-8 data into the IN::TXT record, but that
doesn't seem to be possible with IN::TXT as it currently stands. Is there
some way to treat the UTF-8 string as ASCII-8BIT for binary data
purposes?
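
One candidate in plain Ruby is String#force_encoding, which relabels a string's bytes without changing them. A minimal sketch, assuming a binary buffer like the one above; as_binary is a hypothetical helper name, and I have not verified that IN::TXT handles such a string correctly end to end:

```ruby
# Hypothetical helper: return a copy of str tagged as binary (ASCII-8BIT).
# The bytes themselves are unchanged; dup keeps the caller's string intact.
def as_binary(str)
  str.dup.force_encoding(Encoding::BINARY)
end

utf8 = "\u65E5\u672C\u8A9E"      # three Japanese characters, UTF-8

buffer = "".dup.force_encoding(Encoding::BINARY)
buffer << [0xC00C].pack("n")     # binary data containing a high byte
buffer << as_binary(utf8)        # no CompatibilityError: both sides are binary

p buffer.bytesize
# The receiver can relabel the payload bytes as UTF-8 again to recover the text.
p buffer.byteslice(2, utf8.bytesize).force_encoding(Encoding::UTF_8) == utf8
```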
Kind regards,
Samuel
On 12 June 2012 04:28, Samuel Williams <space.ship.traveller@gmail.com> wrote: