Converting variable-length binary value

I'm sure there ought to be a Ruby function to do this, but I've been
scratching my head whilst going through the Pickaxe book :-)

I want to encode/decode a positive number to/from a variable-length
big-endian binary string.

Originally the best I could come up with was:

  str = "\001\002\003"
  val = 0
  str.each_byte { |b| val = (val << 8) | b }
  p val
  # => 66051

  val = 1234
  str = ""
  while (val > 0)
    str = (val & 0xff).chr + str
    val >>= 8
  end
  p str
  # => "\004\322"

pack/unpack seem only to work for fixed lengths, e.g. 2 or 4 bytes.

Is there a faster or simpler way of doing this in Ruby?
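For reference, the two loops above can be wrapped in a pair of helpers. This is only a sketch: the method names `decode_be` and `encode_be` are made up for illustration, and encoding zero as a single zero octet is an assumption (the loop above would produce an empty string for 0).

```ruby
# Hypothetical helper names; they do not appear in the original post.

# Big-endian bytes -> non-negative Integer.
def decode_be(str)
  str.each_byte.inject(0) { |acc, b| (acc << 8) | b }
end

# Non-negative Integer -> shortest big-endian binary string.
def encode_be(val)
  raise ArgumentError, "value must not be negative" if val < 0
  bytes = []
  while val > 0
    bytes.unshift(val & 0xff)   # prepend the low-order octet
    val >>= 8
  end
  bytes = [0] if bytes.empty?   # assumption: zero becomes one zero octet
  bytes.pack("C*")              # binary (ASCII-8BIT) string
end
```

Collecting the octets in an array and packing once avoids building a new string on every iteration, which the `chr + str` form does.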

Then I discovered I can go via hex:

  p "\001\002\003".unpack("H*")[0].hex
  # => 66051

  str = 1234.to_s(16)
  str = "0#{str}" if str.length % 2 != 0
  val = [str].pack("H*")
  p val
  # => "\004\322"

That's still pretty nasty. Any better offers?

Thanks,

Brian.

You're on the right track... It looks like you're just doing too much
work. How about:

  [int.to_s(16)].pack('H*')

to pack it, and:

  string.unpack('H*').first.to_i(16)

to unpack?

Also, if you aren't tied to this exact format, there's the
BER-compressed integer option in #pack/#unpack, which handles
variable-length integers in a nice way:

  [12345678901234567890].pack('w')
    ==>"\201\253\252\252\261\316\330\374\225R"
  [1].pack('w')
    ==>"\001"
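A small round trip may make the 'w' format easier to picture. This is a sketch only; the sample values are arbitrary. Each octet carries 7 bits of the number, most significant group first, and the high bit is set on every octet except the last.

```ruby
# Pack several integers back to back as BER-compressed integers.
packed = [1, 300, 1234].pack("w*")

# 1    -> 0x01
# 300  -> 0x82 0x2c   (2 * 128 + 44)
# 1234 -> 0x89 0x52   (9 * 128 + 82)
octets = packed.bytes

# unpack recovers the values because each integer is self-delimiting.
values = packed.unpack("w*")
```

Note that this is not the same byte layout as the plain big-endian string asked about above, since one bit per octet is spent on the continuation flag.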

cheers,
Mark

On 6/4/05, Brian Candler <B.Candler@pobox.com> wrote:

how about:

  [int.to_s(16)].pack('H*')

That doesn't work if the number of hex digits is odd:

irb(main):006:0> 1234.to_s(16)
=> "4d2"
irb(main):007:0> [1234.to_s(16)].pack("H*")
=> "M " # that's \x4d \x20
irb(main):008:0> ["04d2"].pack("H*")
=> "\004\322" # that's \x04 \xd2
irb(main):009:0>
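A padded variant of the hex round trip could look like the following sketch. The helper names `int_to_bytes` and `bytes_to_int` are hypothetical; the only change from the one-liner is the leading "0" when the digit count is odd.

```ruby
# Hypothetical helper names, for illustration only.

# Non-negative Integer -> big-endian binary string, via hex.
def int_to_bytes(int)
  hex = int.to_s(16)
  hex = "0" + hex if hex.length.odd?   # pack("H*") needs whole octets
  [hex].pack("H*")
end

# Big-endian binary string -> Integer, via hex.
def bytes_to_int(str)
  str.unpack("H*").first.to_i(16)
end
```

Without the padding, pack("H*") fills the missing low nibble of the last octet with zero, which is why "4d2" comes out as \x4d \x20.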

Also, if you aren't tied to this exact format, there's the
BER-compressed integer option in #pack/#unpack, which handles
variable-length integers in a nice way

As it happens, I'm unpacking BER. In the long form, the length field of a
BER-encoded element is a plain big-endian integer of N octets. See [**] below:

  def ber_read(io)
    blk = io.read(2) # minimum: short tag, short length
    tag = blk[0] & 0x1f
    len = blk[1]

    if tag == 0x1f # long form
      tag = 0
      while true
        ch = io.getc
        blk << ch
        tag = (tag << 7) | (ch & 0x7f)
        break if (ch & 0x80) == 0
      end
      len = io.getc
      blk << len
    end

    if (len & 0x80) != 0 # long form
      len = len & 0x7f
      raise "Indefinite length encoding not supported" if len == 0
      offset = blk.length
      blk << io.read(len)
      # is there a more efficient way of doing this? [**]
      len = 0
      blk[offset..-1].each_byte { |b| len = (len << 8) | b }
    end

    offset = blk.length
    blk << io.read(len)
    return blk, [blk[0] >> 5, tag], offset
  end
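The length handling on its own can be sketched as below. Assumptions: a reasonably recent Ruby (where `IO#readbyte` returns an Integer), and a hypothetical helper name `read_ber_length`. Unlike `ber_read`, this does not keep the raw octets. It is one way to write the step, not necessarily a faster one.

```ruby
require "stringio"

# Hypothetical helper: reads a BER length field from io and returns
# the decoded length as an Integer.
def read_ber_length(io)
  first = io.readbyte
  return first if (first & 0x80).zero?   # short form: the octet is the length

  count = first & 0x7f                   # long form: number of length octets
  raise "Indefinite length encoding not supported" if count.zero?

  octets = io.read(count)
  raise EOFError, "truncated length field" if octets.nil? || octets.bytesize < count

  # Fold the octets into one big-endian integer.
  octets.each_byte.inject(0) { |len, b| (len << 8) | b }
end
```

For example, `read_ber_length(StringIO.new("\x82\x01\x00".b))` reads the prefix 0x82 (two length octets follow) and folds 0x01 0x00 into 256.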

The reason for this is so that I can read a DER element from a stream
(OpenSSL::ASN1::decode requires the data to be in memory first).

You'll notice that the long-form *tag* is encoded in the format you mention;
however I can't use unpack for that since the length isn't known up-front.
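To illustrate the point about the long-form tag: it has to be consumed one octet at a time, stopping at the first octet with the high bit clear. The sketch below assumes a recent Ruby and uses a hypothetical helper name, `read_long_tag`; the octets it reads are the same ones unpack("w") would accept if they were already in a string.

```ruby
require "stringio"

# Hypothetical helper: reads a base-128 long-form tag number from io,
# leaving the stream positioned on the octet after the tag.
def read_long_tag(io)
  tag = 0
  loop do
    byte = io.readbyte
    tag = (tag << 7) | (byte & 0x7f)   # low 7 bits carry the value
    break if (byte & 0x80).zero?       # high bit clear marks the last octet
  end
  tag
end
```

Given a stream holding \x82 \x2c followed by more data, this returns 300 and stops before the next octet, so the length field can be read from the same stream afterwards.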

Regards,

Brian.