Binary file modification

Hi

I've been modifying a binary file that contains various data including
some audio. I'm trying to add my own audio (instead of the audio in the
file) which I believe I have done but I need to modify the
content-length header in the file which indicates the length of the
audio sample. I've been reading the content-length data from the file
using something similar to :

f=open(config[:file],"rb")
f.pos=AUDIO_CONTENT_HEADER_OFFSET
length=f.read(3).unpack("H2H2H2").hex.to_i
f.close
=> an integer

This basically opens the file as a binary file, skips to the audio
content header data position and then unpacks 3 bytes into a string
(formatted as hex) which is then converted to an integer value.

I'd like to be able to reverse this process and take any integer
value (1024 in the case shown below) and write it to a binary file with
some header and footer data - something like :

mydata = SOMEHEADERDATA
mydata += ["1024"].pack("someformat")
mydata += AUDIODATA
f=open(config[:file],"wb")
f.write(mydata)
f.close

However I'm a bit stuck on how to pack the data (if this is the correct
solution). If someone could point me in the correct direction it would
be much appreciated.

···

--
Posted via http://www.ruby-forum.com/.

From: list-bounce@example.com
[mailto:list-bounce@example.com] On Behalf Of Rob Lee

[...]

> I've been reading the content-length data from the file
> using something similar to :
>
> f=open(config[:file],"rb")
> f.pos=AUDIO_CONTENT_HEADER_OFFSET
> length=f.read(3).unpack("H2H2H2").hex.to_i

That should die because Array#hex doesn't exist, but I get the idea. If you
do this kind of thing a lot with different kinds of data then you should
note that to_i and to_s both take optional base arguments, so
foo.unpack('H*').first.to_i(16) should work
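
As a rough illustration of that suggestion, here is a minimal sketch that reads a 3-byte length two ways. The StringIO contents and the offset value are made up for the example (the real file layout is not shown in this thread), and it assumes the first byte is the most significant one, which is what the hex-string approach implies.

```ruby
require 'stringio'

# Hypothetical stand-in for the real file: 4 header bytes, a 3-byte length,
# then some payload. The offset is an assumption for illustration only.
AUDIO_CONTENT_HEADER_OFFSET = 4
io = StringIO.new("HEAD".b + "\x05\x7C\x17".b + "audio".b)

io.pos = AUDIO_CONTENT_HEADER_OFFSET
raw = io.read(3)                          # three raw bytes

# Option 1: unpack to one hex string, then parse it in base 16
length_hex = raw.unpack('H*').first.to_i(16)

# Option 2: unpack to three unsigned bytes and combine them,
# most significant byte first
b0, b1, b2 = raw.unpack('C3')
length_bytes = (b0 << 16) | (b1 << 8) | b2

puts length_hex     # 359447
puts length_bytes   # 359447
```

Both options give the same number; the second avoids the round trip through a hex string.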

> [...]
>
> I'd like to be able to reverse this process and take any integer
> value (1024 in the case shown below) and write it to a binary file with
> some header and footer data - something like :
>
> [...]
>
> mydata += ["1024"].pack("someformat")
>
> [...]
>
> However I'm a bit stuck on how to pack the data (if this is the correct
> solution).

Here's the problem - if you need exactly three bytes then you are going to
have to apply your own padding, since the pack routines will only pack
directly as a long or a short which will mostly be 4 and 2 bytes - both of
which could cause you problems. My hacks always involve <<'ing a single byte
integer onto a string. In your case, here is a horrible oneliner which you
should not use because it is gross.

num=1024
num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s << byte.to_i(16)}
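
For comparison, a less gross way to get exactly three bytes is to pack the number as a 4-byte big-endian integer and drop the leading byte. This is only a sketch: the helper name pack_uint24_be is invented here, and it assumes the length field is stored most significant byte first.

```ruby
# Hypothetical helper (not from the thread): encode an integer as a
# 3-byte big-endian string.
def pack_uint24_be(num)
  unless (0..0xFFFFFF).cover?(num)
    raise ArgumentError, "out of range for 3 bytes: #{num}"
  end
  # "N" packs a 32-bit unsigned big-endian integer (4 bytes);
  # the first byte is always zero for values in range, so drop it.
  [num].pack('N')[1, 3]
end

packed = pack_uint24_be(1024)
p packed.bytes                          # [0, 4, 0]
p packed.unpack('H*').first.to_i(16)    # 1024
```

The range check is there because anything above 0xFFFFFF would silently lose its top byte otherwise.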

You could also try googling BitStruct, which is a ruby library that might
help by defining these headers and footers as structure objects.

Cheers,

ben

···


Ben's right, you can use BitStruct for this:

require 'bit-struct'

class AudioData < BitStruct

   unsigned :audio_length, 3*8, :endian => :little
   rest :data

   # Note: don't use :length as the name of a field, because it will
   # conflict with the #length method inherited from String.

   # the :endian option can also be :big, :network (== :big), or :native
end

audio_data = AudioData.new

data = "foo bar baz"
audio_data.data = data
audio_data.audio_length = data.length

p audio_data
p audio_data.to_s

__END__

Output:

#<AudioData audio_length=11, data="foo bar baz">
"\v\000\000foo bar baz"

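One thing worth checking before using this layout: the example declares the field as little-endian, so the low byte comes first ("\v\000\000"). The hex-string reading code earlier in the thread treats the first byte as the most significant one, so :endian => :big may be what that particular file needs. That is an assumption about the file format, not something confirmed here. The difference can be seen with plain pack, without BitStruct:

```ruby
length = 11

# "V" is 32-bit little-endian, "N" is 32-bit big-endian;
# keep the three bytes that carry the value in each case.
little = [length].pack('V')[0, 3]   # low byte first
big    = [length].pack('N')[1, 3]   # high byte first

p little.bytes                           # [11, 0, 0]
p big.bytes                              # [0, 0, 11]

# Reading both back with the hex-string approach from earlier in the thread:
p little.unpack('H*').first.to_i(16)     # 720896
p big.unpack('H*').first.to_i(16)        # 11
```
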
···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

You've had other answers that answer the question as put. Another
approach that may be useful is to look at SNG, a textual means of
accessing the contents of PNG graphics:

http://www.faqs.org/docs/artu/ch06s01.html#id2910193

        HTH
        Hugh

···

On Mon, 23 Oct 2006, Rob Lee wrote:

Hi

I've been modifying a binary file that contains various data including
some audio. I'm trying to add my own audio (instead of the audio in the
file) which I believe I have done but I need to modify the

Joel VanderWerf wrote:

Ben's right, you can use BitStruct for this:

Forgot the link:

http://raa.ruby-lang.org/project/bit-struct/

···

--
       vjoel : Joel VanderWerf : path berkeley edu : 510 665 3407

Ben Nagy wrote:

> > [...]
> >
> > I've been reading the content-length data from the file
> > using something similar to :
> >
> > f=open(config[:file],"rb")
> > f.pos=AUDIO_CONTENT_HEADER_OFFSET
> > length=f.read(3).unpack("H2H2H2").hex.to_i
>
> That should die because Array#hex doesn't exist, but I get the idea. If you
> do this kind of thing a lot with different kinds of data then you should
> note that to_i and to_s both take optional base arguments, so
> foo.unpack('H*').first.to_i(16) should work

Thanks for pointing that out, it's strange as the code definitely works
(it returns correct content lengths repeatedly for different files) but
as you say Array#hex doesn't exist - I'll look into this ...

> > [...]
> >
> > I'd like to be able to reverse this process and take any integer
> > value (1024 in the case shown below) and write it to a binary file with
> > some header and footer data - something like :
> >
> > [...]
> >
> > mydata += ["1024"].pack("someformat")
> >
> > [...]
> >
> > However I'm a bit stuck on how to pack the data (if this is the correct
> > solution).
>
> Here's the problem - if you need exactly three bytes then you are going to
> have to apply your own padding, since the pack routines will only pack
> directly as a long or a short which will mostly be 4 and 2 bytes - both of
> which could cause you problems. My hacks always involve <<'ing a single byte
> integer onto a string. In your case, here is a horrible oneliner which you
> should not use because it is gross.
>
> num=1024
> num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
> self.length)).reverse}.scan(/../).inject('') {|s,byte| s << byte.to_i(16)}

I'm intrigued by this, I can see you are repacking the int value into
three bytes but I'm still unclear as to how you would then add this to
the file. I guess it needs to be packed as 3 bytes - my guess was

num=1024
newpacket += [num].pack("3C")

However this doesn't seem to work using the reading scheme above. Could
you tell me where I'm going wrong ?
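
A note on why that guess does not behave as hoped: in a pack format the repeat count goes after the directive ("C3", not "3C"), and each C consumes one element of the array, so a one-element array cannot produce three bytes. A minimal sketch of splitting the number into three byte values first is below. The "HDR" packet contents are a placeholder, and it assumes the most significant byte comes first, to match the reading scheme.

```ruby
num = 1024

# Split into three byte values, most significant first (assumed byte order).
bytes = [(num >> 16) & 0xFF, (num >> 8) & 0xFF, num & 0xFF]
chunk = bytes.pack('C3')        # one output byte per array element

newpacket = "HDR".b             # hypothetical packet contents so far
newpacket += chunk

p bytes                                  # [0, 4, 0]
p chunk.unpack('H*').first.to_i(16)      # 1024
```
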

> You could also try googling BitStruct, which is a ruby library that might
> help by defining these headers and footers as structure objects.

I'm looking at using bitstruct for the next version of the project, but
as I've almost got this working now I'd like to avoid a re-write at this
stage if possible !

> Cheers,
>
> ben

Thanks for all the help - you've saved me hours of work already !

···


> Thanks for pointing that out, it's strange as the code definitely works
> (it returns correct content lengths repeatedly for different files) but
> as you say Array#hex doesn't exist - I'll look into this ...

A quick reply to myself - I've just checked the code and I'm missing an
Array#join call in the above example, so the example wasn't correct ...

···


From: list-bounce@example.com
[mailto:list-bounce@example.com] On Behalf Of Rob Lee
Sent: Monday, October 23, 2006 6:06 PM
To: ruby-talk ML
Subject: Re: Binary file modification

Ben Nagy wrote:

[...]

> num=1024
> num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
> self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
> byte.to_i(16)}
>

> I'm intrigued by this, I can see you are repacking the int value into
> three bytes but I'm still unclear as to how you would then add this to
> the file.

It's already packed. The return value of the long expression will be a 3
byte string which you can insert as you like (print it to the IO handle, <<
it to another string etc etc). The accumulator in the inject block is a
string; {|str,byte| ...} might have been more readable, sorry.

ben

···

> It's already packed. The return value of the long expression will be a 3
> byte string which you can insert as you like (print it to the IO handle, <<
> it to another string etc etc). The accumulator in the inject block is a
> string; {|str,byte| ...} might have been more readable, sorry.
>
> ben

Sorry I should have studied it a little closer ...

I've tried the following and your re-packing doesn't seem to work :

f = open("testfile","wb")
num=359447
f.write(num.to_s(16)[0..2].instance_eval {(self.reverse + '0' * (6 -
self.length)).reverse}.scan(/../).inject('') {|s,byte| s <<
byte.to_i(16)})
f.close
f = open("testfile","rb")
f.pos=0
puts f.read(3).unpack('H*').first.to_i(16)
f.close

=> 1404

It does seem to work for small values like 1024 but not larger ones. I
know there will be an upper limit to the integer size I can fit into the
three bytes but the 359447 value is one I've taken from an existing file
so I believe this is a valid integer to try and pack into these three
bytes.

Any advice ?

···


> It does seem to work for small values like 1024 but not larger ones. I
> know there will be an upper limit to the integer size I can fit into the
> three bytes but the 359447 value is one I've taken from an existing file
> so I believe this is a valid integer to try and pack into these three
> bytes.
>
> Any advice ?

Some more information on this - the maximum value that can be
successfully packed is 4095

Any advice ?
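
The 4095 limit looks like it comes from the [0..2] slice in the one-liner: it keeps only the first three hex digits of the number, and 4095 is 0xfff, the largest value that fits in three hex digits. For 359447 (hex 57c17) the slice leaves "57c", which is 1404, the value read back above. A minimal sketch of the effect, and of one possible fix that pads the full hex string to six digits and lets pack do the conversion:

```ruby
num = 359447

hex = num.to_s(16)                       # "57c17"
truncated = hex[0..2]                    # "57c", all the one-liner keeps
p truncated.rjust(6, '0').to_i(16)       # 1404

# Possible fix: keep every digit, left-pad to 6 hex digits (3 bytes),
# then pack the hex string directly into raw bytes.
fixed = [hex.rjust(6, '0')].pack('H*')
p fixed.bytes                            # [5, 124, 23]
p fixed.unpack('H*').first.to_i(16)      # 359447
```

Using pack here also sidesteps a difference between Ruby versions: from Ruby 1.9 onwards, String#<< with an integer appends a character for that codepoint rather than a raw byte, so depending on the string's encoding, byte values above 127 may not come out as single bytes.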

···
