String#unpack and big-endian versus little-endian

I hit an interesting problem yesterday. I have a Ruby script that reads the
contents of a binary file generated by another program. Inside this file,
there are, among other things, binary encodings of integers, longs and
doubles.

To suck those into my code, I’m doing things like reading 4 bytes, then using
unpack("L") to convert them into a long.

As far as I can tell, unpack() is designed to use whatever native encoding
your hardware uses. Hence, if you’re running on a big-endian machine, it
assumes that’s the order of bytes in the data you want to unpack.

However, in this case, the files may have been written on a machine with a
different endian-ness … which is precisely what I discovered yesterday.

Obviously, the people who wrote the original software should have chosen an
endian-ness for the external storage of their data! However, they didn’t, so
I have to deal with it :-(.

I was able to sort it out fairly quickly by writing some code to determine
whether the endian-ness of the data and the endian-ness of the machine where
the script is run match and, if not, do some byte swapping before doing the
unpack().

I ended up with code like …

byte[0], byte[1], byte[2], byte[3] =
byte[3], byte[2], byte[1], byte[0]
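
For readers following along, here is a minimal sketch of the check-and-swap approach described above. The original script is not shown in the thread, so the helper names and the data_little_endian flag are hypothetical, and the sketch assumes the caller already knows the byte order of the file.

```ruby
# Hypothetical helpers; not the poster's actual code.

# True when this machine stores the least significant byte first.
# "S" packs a native-order 16-bit integer, so the first byte is 1
# only on a little-endian machine.
def native_little_endian?
  [1].pack("S").unpack("C").first == 1
end

# Unpack a 4-byte unsigned integer from bytes.
# data_little_endian (an assumed flag) describes the file, not the machine.
def unpack_uint32(bytes, data_little_endian)
  # Swap the bytes only when file order and machine order differ.
  # String#b gives a binary copy, so reverse works byte by byte.
  bytes = bytes.b.reverse if data_little_endian != native_little_endian?
  bytes.unpack("L").first
end

puts unpack_uint32("\x01\x00\x00\x00", true)   # little-endian data
puts unpack_uint32("\x00\x00\x00\x01", false)  # big-endian data
```

Both calls should produce the same value regardless of the machine the script runs on, since the swap happens only when the two byte orders disagree.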

I can’t help but wonder whether there’s a cleverer way to do this. I couldn’t
see anything in unpack() that allowed for specifying the endian-ness, but
maybe there’s some other class or library that handles this kind of stuff
that I just don’t know about.

Any suggestions?

Thanks in advance,

Harry O.

Harry Ohlsen harryo@zip.com.au writes:

> I ended up with code like …
>
> byte[0], byte[1], byte[2], byte[3] =
> byte[3], byte[2], byte[1], byte[0]
>
> I can’t help but wonder whether there’s a cleverer way to do this.
> I couldn’t see anything in unpack() that allowed for specifying the
> endian-ness

See http://www.rubycentral.com/book/ref_c_string.html#String.unpack.

In particular, ‘V’ and ‘N’ may do what you want.
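
As a rough illustration of the suggestion above: "V" reads a 32-bit unsigned integer as little-endian and "N" reads it as big-endian (network order), independent of the machine running the script. The sample bytes below are made up for the example.

```ruby
# Four sample bytes (made up for illustration), as a binary string.
bytes = "\x78\x56\x34\x12".b

# "V": 32-bit unsigned, little-endian byte order.
little = bytes.unpack("V").first

# "N": 32-bit unsigned, big-endian (network) byte order.
big = bytes.unpack("N").first

puts format("V gives 0x%08X", little)  # V gives 0x12345678
puts format("N gives 0x%08X", big)     # N gives 0x78563412
```

With these directives no manual byte swapping is needed; the script only has to pick the directive that matches the order the file was written in.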

> See http://www.rubycentral.com/book/ref_c_string.html#String.unpack.
>
> In particular, ‘V’ and ‘N’ may do what you want.

How did I miss those?! There were heaps of others that were of use to me,
too, for unpacking shorts and doubles.
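
The thread does not say which directives were used for shorts and doubles. As a sketch, assuming 16-bit unsigned shorts and 8-byte IEEE doubles, the fixed-order directives in current Ruby are "v" and "n" for shorts and "E" and "G" for doubles:

```ruby
# Sample data built with pack, so the byte layout is known.
short_be  = [0x0102].pack("n")  # 16-bit unsigned, big-endian: "\x01\x02"
double_be = [1.5].pack("G")     # double-precision float, big-endian
double_le = [1.5].pack("E")     # double-precision float, little-endian

# "n" reads big-endian shorts, "v" reads little-endian shorts.
puts short_be.unpack("n").first  # 258 (0x0102)
puts short_be.unpack("v").first  # 513 (0x0201, same bytes read the other way)

# "G" and "E" are the big- and little-endian double directives.
puts double_be.unpack("G").first  # 1.5
puts double_le.unpack("E").first  # 1.5
```

The lowercase "g" and "e" directives do the same for single-precision floats.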

My apologies for being so blind. Perhaps it has something to do with having
been up until 0200 getting a damned USB ADSL modem going … and then having
to get up at 0500 to go to work :-(.

Harry O.