I'm sure there ought to be a Ruby function to do this, but I've been
scratching my head whilst going through the Pickaxe book.
I want to encode/decode a positive number to/from a variable-length
big-endian binary string.
Originally the best I could come up with was:
str = "\001\002\003"
val = 0
str.each_byte { |b| val = (val << 8) | b }
p val
# => 66051
val = 1234
str = ""
while (val > 0)
  str = (val & 0xff).chr + str
  val >>= 8
end
p str
# => "\004\322"
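For comparison, here is a minimal sketch of the same two conversions without explicit loops. It assumes a modern Ruby (2.4 or later, for Integer#digits), which postdates this thread; the helper names be_decode and be_encode are hypothetical and do not come from the original posts.

```ruby
# Hypothetical helper names, not from the thread.

# Big-endian bytes -> Integer: fold each byte into the accumulator.
def be_decode(str)
  str.each_byte.inject(0) { |val, b| (val << 8) | b }
end

# Integer -> big-endian bytes. Integer#digits(256) yields base-256 digits,
# least significant first, so reverse them before packing as unsigned bytes.
def be_encode(val)
  val.digits(256).reverse.pack("C*")
end

p be_decode("\x01\x02\x03".b)   # => 66051
p be_encode(1234).bytes         # => [4, 210]
```

One difference from the while loop above: be_encode(0) returns a single zero byte, whereas the loop produces an empty string for 0.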
pack/unpack seem only to work for fixed lengths, e.g. 2 or 4 bytes.
Is there a faster or simpler way of doing this in Ruby?
Then I discovered I can go via hex:
p "\001\002\003".unpack("H*")[0].hex
# => 66051
str = 1234.to_s(16)
str = "0#{str}" if str.length % 2 != 0
val = [str].pack("H*")
p val
# => "\004\322"
That's still pretty nasty. Any better offers?
Thanks,
Brian.
You're on the right track... It looks like you're just doing too much
work. How about:
[int.to_s(16)].pack('H*')
to pack it, and:
string.unpack('H*').first.to_i(16)
to unpack?
Also, if you aren't tied to this exact format, there's the
BER-compressed integer option in #pack/#unpack, which handles
variable-length integers in a nice way:
[12345678901234567890].pack('w')
==>"\201\253\252\252\261\316\330\374\225R"
[1].pack('w')
==>"\001"
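As a small illustration of the 'w' directive, the sketch below round-trips a value and shows the bytes it produces. The format stores 7 bits per byte, most significant group first, with the high bit set on every byte except the last, so it is not the same layout as plain big-endian base 256. The value 300 is just an example.

```ruby
packed = [300].pack("w")
p packed.bytes                  # => [130, 44]  (0x82, 0x2c: 300 = 2 * 128 + 44)
p packed.unpack("w").first      # => 300

big = 12345678901234567890
p [big].pack("w").unpack("w").first == big   # => true
```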
cheers,
Mark
On 6/4/05, Brian Candler <B.Candler@pobox.com> wrote:
[snip: full quote of the original message above]
> how about:
> [int.to_s(16)].pack('H*')
That doesn't work if the number of hex digits is odd:
irb(main):006:0> 1234.to_s(16)
=> "4d2"
irb(main):007:0> [1234.to_s(16)].pack("H*")
=> "M " # that's \x4d \x20
irb(main):008:0> ["04d2"].pack("H*")
=> "\004\322" # that's \x04 \xd2
irb(main):009:0>
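The odd-digit problem shown in the irb session can be avoided by padding the hex string to a whole number of bytes before packing. A minimal sketch follows; the helper names hex_encode and hex_decode are hypothetical, and the code assumes non-negative integers only.

```ruby
# Hypothetical helper names, not from the thread.

# Integer -> big-endian bytes via hex, padded to an even number of digits
# so that pack("H*") never has to fill in a missing low nibble.
def hex_encode(n)
  raise ArgumentError, "expected a non-negative integer" if n < 0
  hex = n.to_s(16)
  hex = "0" + hex if hex.length.odd?
  [hex].pack("H*")
end

# Big-endian bytes -> Integer via hex.
def hex_decode(str)
  str.unpack("H*").first.to_i(16)
end

p hex_encode(1234).bytes           # => [4, 210]
p hex_decode("\x01\x02\x03".b)     # => 66051
```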
> Also, if you aren't tied to this exact format, there's the
> BER-compressed integer option in #pack/#unpack, which handles
> variable-length integers in a nice way
As it happens, I'm unpacking BER. The length field in a BER-encoded element
is encoded as a straightforward N octets. See [**] below:
def ber_read(io)
  blk = io.read(2) # minimum: short tag, short length
  tag = blk[0] & 0x1f
  len = blk[1]
  if tag == 0x1f # long form
    tag = 0
    while true
      ch = io.getc
      blk << ch
      tag = (tag << 7) | (ch & 0x7f)
      break if (ch & 0x80) == 0
    end
    len = io.getc
    blk << len
  end
  if (len & 0x80) != 0 # long form
    len = len & 0x7f
    raise "Indefinite length encoding not supported" if len == 0
    offset = blk.length
    blk << io.read(len)
    # is there a more efficient way of doing this? [**]
    len = 0
    blk[offset..-1].each_byte { |b| len = (len << 8) | b }
  end
  offset = blk.length
  blk << io.read(len)
  return blk, [blk[0] >> 5, tag], offset
end
The reason for this is so that I can read a DER element from a stream
(OpenSSL::ASN1::decode requires the data to be in memory first).
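The length-field step marked [**] in ber_read can be isolated as follows. This is a minimal sketch, not a replacement for the function above: it assumes a modern Ruby where IO#getbyte returns an Integer (in the Ruby of this thread, getc and String#[] did that), and the helper name read_ber_length is hypothetical.

```ruby
require "stringio"

# Hypothetical helper: reads one BER/DER length field from io and returns
# the content length as an Integer.
def read_ber_length(io)
  first = io.getbyte
  raise EOFError, "missing length octet" if first.nil?
  return first if (first & 0x80) == 0          # short form: 0..127

  count = first & 0x7f                         # long form: number of length octets
  raise "Indefinite length encoding not supported" if count == 0
  octets = io.read(count)
  raise EOFError, "truncated length field" if octets.nil? || octets.bytesize < count

  # Big-endian fold over the length octets.
  octets.each_byte.inject(0) { |len, b| (len << 8) | b }
end

p read_ber_length(StringIO.new("\x05".b))           # => 5
p read_ber_length(StringIO.new("\x82\x01\x00".b))   # => 256
```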
You'll notice that the long-form *tag* is encoded in the format you mention;
however I can't use unpack for that since the length isn't known up-front.
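Since the long-form tag ends at the first byte with a clear high bit, it can be read one byte at a time from the stream. A minimal sketch of that loop, again assuming a modern Ruby with IO#getbyte and using the hypothetical helper name read_long_tag:

```ruby
require "stringio"

# Hypothetical helper: reads a base-128 long-form tag number from io.
# Each byte contributes its low 7 bits; a clear high bit marks the last byte.
def read_long_tag(io)
  tag = 0
  loop do
    b = io.getbyte
    raise EOFError, "truncated tag" if b.nil?
    tag = (tag << 7) | (b & 0x7f)
    break if (b & 0x80) == 0
  end
  tag
end

io = StringIO.new("\x81\x00\xff".b)
p read_long_tag(io)   # => 128
p io.pos              # => 2 (the trailing 0xff byte is left unread)
```

This is the same byte layout that pack('w') produces, so once the bytes are in memory, unpack('w') gives the same number.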
Regards,
Brian.