Finding a tag in a binary file

rob_s · 23 February 2011 21:28

Hi, I'm a complete non-programmer but willing to give it a try.

I have a DICOMDIR (medical imaging file) which contains various tags,
each followed by its value: e.g. tag 0010,0010 is the "patient name" and
0008,0020, the "study date", is another.

I have several DICOMDIR files and I would like to extract just a few bits of
information from them to a .txt file. The average file size is about
343 KB, as each is full of other information that I don't need. Each
file has, amongst other things, a list of about 10 names, scan dates etc.

So far I've managed to read the file into an array using

contentsArray = [] # start with a new, empty array

open('1DICOMDIR', 'rb') { |f| f.each_byte { |byte| contentsArray.push byte } }

This gets the info into an array.

Although it's a hex file, it displays the contents in decimal, but they
correspond to the hex codes I've seen in a hex editor...

My problem is that I don't know how to detect 0010,0010 and then extract the
name. Can anyone help? I've looked all over the web but haven't found
anything that helps me much. Thanks a lot.

--
Posted via http://www.ruby-forum.com/.

niklas_brueckenschla · 23 February 2011 22:07

Example:
  byte = 0b00100010
  # clear the trailing 4 bits, then shift:
  first_four_bits = (byte & 0b11110000) >> 4
  # clear the leading 4 bits:
  last_four_bits = byte & 0b1111

btw., your code would be shorter as:
   contentsArray = open('1DICOMDIR', 'rb') { |f|
     f.each_byte.to_a
   }

(also it's not so common to use camelCase in the ruby world, we usually
prefer under_scored_identifiers, butThatsUpToYou...)

By default ruby will print numbers as decimal, however you can look at
them in other ways by using to_s and passing in a base:
123.to_s(16) #=> "7b"
123.to_s(2) #=> "1111011"

hope this helps.

-- niklas
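
To compare the array's decimal values with what a hex editor shows, each byte can be formatted as two hex digits. The following is only a sketch; `hex_dump` is a hypothetical helper name, and the sample bytes assume the tag is stored little-endian and followed by the characters "PN".

```ruby
# Hypothetical helper: format byte values the way a hex editor displays them.
def hex_dump(bytes)
  # '%02x' pads each value to two lowercase hex digits.
  bytes.map { |b| format('%02x', b) }.join(' ')
end

# Assumed byte layout of tag (0010,0010) followed by "PN".
puts hex_dump([16, 0, 16, 0, 80, 78])  # prints "10 00 10 00 50 4e"
```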

David8 · 23 February 2011 22:12

Hi Rob,

I don't usually work with binary files, but I would imagine you could do
something like this (untested):

search_byte = 0b00100010
contents = File.binread('1DICOMDIR').bytes.to_a
index = contents.index(search_byte)

# I assume you want the next entry after
name = contents[index.next]

Then simply decode the name however is appropriate.

David
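
Searching for a single byte will not locate a four-byte tag, so a sketch that searches for the whole tag may be closer to what is needed. It rests on several assumptions not confirmed in this thread: that the file is encoded as Explicit VR Little Endian, so tag (0010,0010) appears as the bytes 10 00 10 00 followed by the two-letter type "PN" and a 2-byte length; and that a plain byte scan is acceptable, although it can give false matches if the same bytes occur inside other data. `extract_values` is a hypothetical name, and the sample data is made up.

```ruby
# Hypothetical helper: return the values of every element with the given
# tag and VR found in a binary string (assumes Explicit VR Little Endian).
def extract_values(data, group, element, vr)
  # 'v' packs a 16-bit little-endian integer, so (0x0010, 0x0010)
  # becomes the bytes 10 00 10 00; the VR characters follow the tag.
  marker = [group, element].pack('vv') + vr
  values = []
  pos = 0
  while (hit = data.index(marker, pos))
    length = data[hit + 6, 2].unpack('v').first  # 2-byte value length
    break unless length                           # truncated data
    values << data[hit + 8, length].strip         # drop padding
    pos = hit + 8 + length
  end
  values
end

# Made-up sample: two leading bytes, a patient name and a study date.
sample = 'xx' +
         [0x0010, 0x0010].pack('vv') + 'PN' + [8].pack('v') + 'Doe^John' +
         [0x0008, 0x0020].pack('vv') + 'DA' + [8].pack('v') + '20110223'

p extract_values(sample, 0x0010, 0x0010, 'PN')  # prints ["Doe^John"]
p extract_values(sample, 0x0008, 0x0020, 'DA')  # prints ["20110223"]
```

For a real file, the string would come from something like `File.open('1DICOMDIR', 'rb') { |f| f.read }`, and the results could be written out with `File.open('names.txt', 'w')`. A dedicated DICOM parser would be more robust than this scan.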

rob_s · 23 February 2011 22:47

Thanks for the replies. It looks like I don't have binread:
contents = File.binread('1DICOMDIR').bytes.to_a
undefined method `binread' for File:Class
I have Ruby 1.8.7.

I'll have a play with the code you suggested tomorrow.
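
On Ruby 1.8.7, where `binread` is not available, the same array of byte values can probably be obtained by opening the file in binary mode and unpacking the string. This is a sketch only; `read_bytes` is a hypothetical helper name.

```ruby
# Hypothetical helper: read a file as an array of byte values (0..255)
# without relying on binread.
def read_bytes(path)
  # 'rb' opens in binary mode; 'C*' unpacks every byte as an unsigned integer.
  File.open(path, 'rb') { |f| f.read }.unpack('C*')
end
```

With this in place, `contents = read_bytes('1DICOMDIR')` would stand in for the `binread` line.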

rob_s · 24 February 2011 21:53

Hi David, sorry, that does not work either, not on my system anyway.
It looks like a nice way of doing it, though, if I can get the code to run.

David8 · 24 February 2011 18:56

My bad, it should be IO.binread

David8 · 25 February 2011 01:01

I forgot to mention, binread is 1.9 only.
