IO.readint?

Hi all,
I'm parsing a binary file, and need to read an integer, something I
would do in C like this:

int b;
read(f, &b, sizeof(int));

obviously considering endianness. I'm pretty sure there has to be a
faster way to do it, but this is how I'm doing it right now (as you
can see, pretty naive):

class IO
  # read int, assume little endian
  def geti
    # getbyte returns each byte as an Integer (on Ruby 1.9+ getc returns a String)
    c1 = getbyte
    c2 = getbyte
    c3 = getbyte
    c4 = getbyte
    c4 << 3*8 | c3 << 2*8 | c2 << 8 | c1
  end
end

What would be the ruby-way to do it?
thanks for any tip...

···

--
rolando -- [[ knowledge is empty, fill it ]] --
"Tam pro papa quam pro rege bibunt omnes sine lege."

class IO
  def geti( endian = :little )
    str = self.read( 4 )
    str = str.reverse if endian == :little
    str.unpack( 'N' )[0]
  end
end

By default this method treats the four bytes it reads as little-endian.
To read a big-endian value, pass :big as the argument ...

io.geti :big

Strictly speaking, any argument other than :little selects big-endian;
I use :big only to mirror :little.

Blessings,
TwP
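
The reverse step above can probably be avoided, since String#unpack also has a 'V' directive (32-bit unsigned, little-endian) next to 'N' (32-bit unsigned, big-endian). A minimal sketch, assuming a hypothetical helper name read_uint32 that does not appear in the thread, with StringIO standing in for a real file:

```ruby
require 'stringio'

# Hypothetical helper (not from the thread): reads one 32-bit unsigned
# integer and picks the unpack directive from the requested byte order.
def read_uint32(io, endian = :little)
  bytes = io.read(4)
  # IO#read returns nil at EOF and a short string on truncated input.
  return nil if bytes.nil? || bytes.bytesize < 4
  bytes.unpack(endian == :little ? 'V' : 'N')[0]
end

# Stand-in for a binary file: 1234 stored little-endian, then big-endian.
io = StringIO.new([1234].pack('V') + [1234].pack('N'))
little  = read_uint32(io, :little)
big     = read_uint32(io, :big)
at_eof  = read_uint32(io)
```

Returning nil on a short read is a design choice for this sketch; raising EOFError would be just as reasonable.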

···

On 10/12/06, Rolando Abarca <funkaster@gmail.com> wrote:

> What would be the ruby-way to do it?

Check out String#unpack:

File.binread('ints.bin').unpack('I*')
# => [1234, 2345, 3456]

'I' reads a native-endian unsigned int; other directives pin the byte
order, such as 'V' (32-bit little-endian) and 'N' (32-bit big-endian).
Also, if you gotta crawl through the file, you can tell IO#read how many
bytes you want:

File.open('ints.bin', 'rb') { |f| puts f.read(4).unpack('I') until f.eof? }
# 1234
# 2345
# 3456
# => nil

···

On Fri, 2006-10-13 at 00:35 +0900, Rolando Abarca wrote:

> Hi all,
> I'm parsing a binary file, and need to read an integer,

--
Ross Bamford - rosco@roscopeco.REMOVE.co.uk

Tim Pease:

···

> On 10/12/06, Rolando Abarca <funkaster@gmail.com> wrote:
> > int b;
> > read(f, &b, sizeof(int));
>
> class IO
>   def geti( endian = :little )
>     str = self.read( 4 )
>     str = str.reverse if endian == :little
>     str.unpack( 'N' )[0]
>   end
> end

Yet you cannot be sure that sizeof(int) is 4.

Kalman

Thanks a lot, I'll try that!!!

···

On 10/12/06, Tim Pease <tim.pease@gmail.com> wrote:

> class IO
>   def geti( endian = :little )
>     str = self.read( 4 )
>     str = str.reverse if endian == :little
>     str.unpack( 'N' )[0]
>   end
> end

--
rolando -- [[ knowledge is empty, fill it ]] --
"Tam pro papa quam pro rege bibunt omnes sine lege."

> Yet you cannot be sure that sizeof(int) is 4.

No, but in my case it works just fine: it will always be a 4-byte unsigned integer.

> Kalman

regards,

···

On 10/12/06, Kalman Noel <invalid@gmx.net> wrote:
--
rolando -- [[ knowledge is empty, fill it ]] --
"Tam pro papa quam pro rege bibunt omnes sine lege."

No problem. Take a look at bit-struct if you find yourself needing to
do some more complex packing and unpacking of binary data ...

http://raa.ruby-lang.org/project/bit-struct/

Blessings,
TwP

···

On 10/12/06, Rolando Abarca <funkaster@gmail.com> wrote:

> Thanks a lot, I'll try that!!!

class IO
  def geti( endian = :little )
    str = read( [0].pack('N').length )
    str.reverse! if endian == :little
    str.unpack( 'N' )[0]
  end
end

Regards

  robert

···

On 12.10.2006 18:10, Kalman Noel wrote:

> Tim Pease:
> > str = self.read( 4 )
>
> Yet you cannot be sure that sizeof(int) is 4.

have you been using this for your stuff tim?

-a

···

On Fri, 13 Oct 2006, Tim Pease wrote:

> On 10/12/06, Rolando Abarca <funkaster@gmail.com> wrote:
> > Thanks a lot, I'll try that!!!
>
> No problem. Take a look at bit-struct if you find yourself needing to
> do some more complex packing and unpacking of binary data ...
>
> http://raa.ruby-lang.org/project/bit-struct/
>
> Blessings,
> TwP

--
my religion is very simple. my religion is kindness. -- the dalai lama

Rolando Abarca wrote:

···

> --
> rolando -- [[ knowledge is empty, fill it ]] --
> "Tam pro papa quam pro rege bibunt omnes sine lege."

Quicquid Venus imperat, labor est suavis...

:wink:
Hal

Robert Klemme:

> Kalman Noel wrote:
> > Tim Pease:
> > > str = self.read( 4 )
> >
> > Yet you cannot be sure that sizeof(int) is 4.
>
> str = read( [0].pack('N').length )

Hey, only now I learnt that sizeof(int) is 4 even on my amd64 machine. I had to
check that with a C program to make me believe it.

Kalman

Ooooo ... clever!

class IO
  SIZEOF_INT = [0].pack('N').length

  def geti( endian = :little )
    str = read( SIZEOF_INT )
    str.reverse! if endian == :little
    str.unpack( 'N' )[0]
  end
end

I'm too lazy to benchmark it today, but is reverse! faster than
reverse on strings?

Blessings,
TwP

···

On 10/13/06, Robert Klemme <shortcutter@googlemail.com> wrote:

> On 12.10.2006 18:10, Kalman Noel wrote:
>
> class IO
>   def geti( endian = :little )
>     str = read( [0].pack('N').length )
>     str.reverse! if endian == :little
>     str.unpack( 'N' )[0]
>   end
> end

No, we've just been parsing very large pixel images. No complex data
structures. Read four bytes, mask off the hamming code and error
bits, store the pixel data in an mmap cache, repeat until EOF.

From the bit-struct readme ...

"BitStruct is most efficient when your data is primarily treated as a
binary string, and only secondarily treated as a data structure. (For
instance, you are routing packets from one socket to another, possibly
looking at one or two fields as it passes through or munging some
headers.) If accessor operations are a bottleneck, a better approach
is to define a class that wraps an array and uses pack/unpack when the
object needs to behave like a binary string."

TwP
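
The read, mask, repeat-until-EOF loop described above could be sketched roughly as follows. The real bit layout is not given in the thread, so PIXEL_MASK (low 24 bits are pixel data), the little-endian word order, and the helper name each_pixel are all assumptions made purely for illustration; the mmap cache is replaced by a plain Array.

```ruby
require 'stringio'

# ASSUMPTION: the low 24 bits carry pixel data and the high 8 bits carry
# the Hamming code and error bits. The actual layout is not in the thread.
PIXEL_MASK = 0x00FF_FFFF

# Hypothetical helper: yields the masked value of every complete
# 4-byte little-endian word until the stream runs out.
def each_pixel(io)
  while (word = io.read(4)) && word.bytesize == 4
    yield word.unpack('V')[0] & PIXEL_MASK
  end
end

# Stand-in for the image file: two words with made-up check bits on top.
data = StringIO.new([0xAB123456, 0xCD000001].pack('V*'))
pixels = []
each_pixel(data) { |pixel| pixels << pixel }
```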

···

On 10/12/06, ara.t.howard@noaa.gov <ara.t.howard@noaa.gov> wrote:

> On Fri, 13 Oct 2006, Tim Pease wrote:
> >
> > No problem. Take a look at bit-struct if you find yourself needing to
> > do some more complex packing and unpacking of binary data ...
> >
> > http://raa.ruby-lang.org/project/bit-struct/
> >
> > Blessings,
> > TwP
>
> have you been using this for your stuff tim?

Tim Pease wrote:

···

> I'm too lazy to benchmark it today, but is reverse! faster than
> reverse on strings?

Yes, most likely. No new object is created.

  robert
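
The "no new object" point can be checked directly with Object#equal?, which compares object identity. This is only a sketch of the allocation difference, not a timing benchmark, and the variable names are made up for the example:

```ruby
word = [1234].pack('V')          # four bytes, as if freshly read from a file

copy = word.reverse              # reverse builds and returns a new String
copy_is_new = !copy.equal?(word)

buffer = word.dup
result = buffer.reverse!         # reverse! flips buffer in place and returns it
same_object = result.equal?(buffer)

# Either way the decoded value is the same.
value_a = word.reverse.unpack('N')[0]
value_b = word.dup.reverse!.unpack('N')[0]
```

Since the string returned by read is not shared with anyone else, mutating it in place is safe here.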

I thought that 'N' was _always_ a 32-bit integer in network byte order?
According to the docs, platform-independent sizes are used everywhere
except for the sSiIlL directives when followed by an underscore...?
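
A quick way to see the difference is to pack a value and look at the byte count. A small sketch, assuming a current Ruby where 'I' is the native unsigned int and 'L_' the native unsigned long; the variable names are only for illustration:

```ruby
# Fixed-size directives: the packed length does not depend on the platform.
n_size = [0].pack('N').bytesize     # 32-bit big-endian (network order), always 4
v_size = [0].pack('V').bytesize     # 32-bit little-endian, always 4

# Native directives: the length follows the C compiler's types.
int_size  = [0].pack('I').bytesize  # sizeof(unsigned int), typically 4
long_size = [0].pack('L_').bytesize # sizeof(unsigned long), 4 or 8

# Byte order is fixed by the directive as well.
big_endian_one    = [1].pack('N').bytes
little_endian_one = [1].pack('V').bytes
```

So [0].pack('N').length is a constant 4 rather than a measurement of sizeof(int); [0].pack('I').bytesize is closer to what the C code asks for.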

···

On Sat, 2006-10-14 at 02:42 +0900, Tim Pease wrote:

> class IO
>   SIZEOF_INT = [0].pack('N').length
>
>   def geti( endian = :little )
>     str = read( SIZEOF_INT )
>     str.reverse! if endian == :little
>     str.unpack( 'N' )[0]
>   end
> end

--
Ross Bamford - rosco@roscopeco.REMOVE.co.uk

have you looked into using narray? then you can mask the entire image at
once. i have code that turns an mmap into an narray - it's quite simple. got
a sample file?

-a

···

On Fri, 13 Oct 2006, Tim Pease wrote:

> No, we've just been parsing very large pixel images. No complex data
> structures. Read four bytes, mask off the hamming code and error
> bits, store the pixel data in an mmap cache, repeat until EOF.

--
my religion is very simple. my religion is kindness. -- the dalai lama