Problems running code to read binary in Windows7

Hello to all in forum,

This is my first post here and I'm a newbie in Ruby; maybe somebody can
help me with this.

I have a script that reads and parses a binary file. The script works
fine if I run it on Ubuntu, but when I try to run the code in IRB on
Windows 7 with Ruby version "2.0.0p247 (2013-06-27) [i386-mingw32]"
I receive the following error.


##########################################################################
C:\Scripts>ruby script.rb binaryfile
script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'
##########################################################################

The script looks like this (line 18 is the "while gets" line):

##########################################################################
#!/usr/bin/env ruby

File.open(ARGV[0])

while gets
    line = $_.unpack('H*', "rb")[0]

  Some code
end
##########################################################################

Thanks in advance for any help.

Best Regards

--
Posted via http://www.ruby-forum.com/.

First guess ...

- File.open(ARGV[0])
+ File.open(ARGV[0], 'rb') # opens file in 'binary' mode.

No Windows here to try it.

Abinoam Jr.
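
A minimal sketch of what the 'rb' mode changes, assuming a throwaway temp file (the file name and variable names are illustrative, not from the original script). In binary mode Ruby labels the stream as ASCII-8BIT instead of using the platform's default external encoding:

```ruby
require 'tempfile'

# Hypothetical sample data: two bytes that are not valid UTF-8 on their own.
sample = Tempfile.new('binary-demo')
sample.binmode
sample.write("\xff\x45".b)
sample.close

# Ask each handle which encoding Ruby attaches to the data it reads.
text_encoding   = File.open(sample.path, 'r')  { |f| f.external_encoding }
binary_encoding = File.open(sample.path, 'rb') { |f| f.external_encoding }

puts "text mode:   #{text_encoding}"    # platform default (for example UTF-8 or CP850)
puts "binary mode: #{binary_encoding}"  # ASCII-8BIT
```

This only affects handles opened this way; a bare "gets" does not read from the handle returned by File.open.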

(In reply to Sever Siller <lists@ruby-forum.com>, Fri, Sep 13, 2013 at 5:16 PM.)

Hello Abinoam Jr.,

I've tried File.open(ARGV[0], 'rb') too, but I get the same error :(

Maybe somebody could help me.

Thanks in advance.

Best regards


Hello Chris and Abinoam,

Thank you for your answers. I'll try your suggestions.

I use "while gets" because the binary file is divided into blocks, so
the code inside "while gets" runs each time the beginning of a block
appears.

The issue is that the binary file is 2 GB in size, and I don't know why
the code works on Linux under Ruby 2.0 but doesn't work on Windows with
Ruby 2.0.

Thanks again, I'll check


Hello again Chris and Abinoam,

I've tested Chris's option but I still receive the error. Maybe it is
something about the interpretation or encoding that Ruby is not
understanding.

script.rb:18:in `gets': encoding mismatch: CP850 IO with UTF-8 RS
(ArgumentError)
        from script.rb:18:in `gets'
        from script.rb:18:in `<main>'

I've tested Abinoam's option too and I receive this error:
script.rb:50: syntax error, unexpected end-of-input, expecting
keyword_end

Line 50 contains "end". It is the end of the "while gets" loop.

Thanks in advance for your help.


Hello tamouse/Robert,

Thank you for your answers.

Using what you suggested means eliminating the "while gets" loop. Since
you say I need to process the buffer, what should replace this line

line = $_.unpack('H*')[0]

inside the loop "while f.read(1024,buffer)"?

$/ is defined at the beginning of the script, as below:

############################################################
BEGIN{ $/="\xff\x45" }

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
     # process the buffer
  end
end
############################################################

Thanks again for the help.

Best regards


Hello Robert,

Thank you.

I think it is fixed now by adding
BEGIN{ $/="\xff\x45".force_encoding("BINARY") }

http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/

Many thanks for all your help.
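
One possible reading of why this helps, offered as a sketch rather than a confirmed diagnosis: the error message names a CP850 stream and a UTF-8 record separator (RS, i.e. $/). A "\xff\x45" literal carries the script's source encoding, while force_encoding relabels the same bytes as BINARY (ASCII-8BIT) without changing them. The variable names below are illustrative:

```ruby
# The same two bytes used as the record separator in the thread.
literal = "\xff\x45"

# Copy the literal and relabel it as raw bytes; the bytes themselves stay the same.
binary = literal.dup.force_encoding(Encoding::BINARY)

puts literal.encoding                 # the script's source encoding (UTF-8 by default since Ruby 2.0)
puts literal.valid_encoding?          # false if that encoding is UTF-8: 0xFF cannot start a UTF-8 character
puts binary.encoding                  # ASCII-8BIT
puts binary.valid_encoding?           # true: any byte sequence is valid as raw bytes
puts binary.bytes == literal.bytes    # true: only the label changed
```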


Hello Robert,

How would your suggestion look, considering the current code is like this:

##########################################################
BEGIN{ $/="\xff\x45".force_encoding("BINARY")}

while gets
    line = $_.unpack('H*')[0]
    next unless line =~ /Regexp/
  #Some code using "line" content
end
##########################################################

Should it be something like:

while line = gets("\xff\x45".force_encoding("BINARY"))

Thanks in advance


File.open returns a File object, which you are not saving a reference to.
As far as I know, gets will default to standard input. Does the app wait
until you hit a key?

IO has a binread method that looks applicable.
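
For context, a bare "gets" (Kernel#gets) reads from ARGF: the files named in ARGV if there are any, otherwise standard input. The binread suggestion can be sketched as follows, assuming a small temp file standing in for the script's ARGV[0] (names and sample bytes are hypothetical):

```ruby
require 'tempfile'

# Hypothetical sample file standing in for the binary file passed on the command line.
sample = Tempfile.new('binread-demo')
sample.binmode
sample.write("\x01\x02\xff\x45".b)
sample.close

# File.binread (inherited from IO.binread) opens the file in binary mode,
# reads all of it into one String, and closes it again.
data = File.binread(sample.path)

puts data.encoding          # ASCII-8BIT
puts data.bytesize          # 4
puts data.unpack('H*')[0]   # 0102ff45
```

Note that this reads the whole file into memory at once, which may not be practical for a file of around 2 GB.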

(In reply to Sever Siller <lists@ruby-forum.com>, Fri, Sep 13, 2013 at 6:47 PM.)

Chris is right (I didn't pay attention to the rest of the code).

Well... why do you want to handle a "binary" file in an "each_line" fashion?

But if that is really what you need, here is a version that takes your
code as the starting point:

File.open(ARGV[0], "rb") do |f|
  while f.gets
    line = $_.unpack('H*')[0]
    # Some code here
  end
end

To read the file as a whole... (as Chris suggested).

file = IO.binread(ARGV[0])
hexfile = file.unpack('H*')[0]

Abinoam Jr.

(In reply to Chris Hulan <chris.hulan@gmail.com>, Fri, Sep 13, 2013 at 10:06 PM.)

Difficult to know what is happening here.

Wondering if you should specify:

   File.open(file, 'rb', :encoding => Encoding::UTF_8 ) do |f|
     while (f.read(1024,buffer))
        # process the buffer
     end
   end

(In reply to Sever Siller <lists@ruby-forum.com>, Sep 14, 2013, at 12:42 AM.)

You need to declare buffer:

irb(main):028:0> File.open('x','rb'){|io| while io.read(1024, buffer);
puts buffer.bytesize; end}
NameError: undefined local variable or method `buffer' for main:Object
        from (irb):28:in `block in irb_binding'
        from (irb):28:in `open'
        from (irb):28
        from /usr/bin/irb:12:in `<main>'

Also, there seems to be no point in declaring an encoding when reading binary.

File.open(file, 'rb') do |f|
  buffer = ''

  while f.read(1024, buffer)
     # process the buffer
  end
end

Kind regards

robert
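
As an illustration of what "process the buffer" could look like, here is a minimal sketch that reads fixed-size chunks and converts each one to hex. The temp file, its contents, and the variable names are assumptions made for the example:

```ruby
require 'tempfile'

# Hypothetical sample file: 2400 bytes of repeating data.
sample = Tempfile.new('chunk-demo')
sample.binmode
sample.write("\x00\x01\x02\x03".b * 600)
sample.close

chunk_sizes = []
hex_length  = 0

File.open(sample.path, 'rb') do |f|
  buffer = String.new          # reused for every read, so no new String per chunk

  # read(length, buffer) fills buffer and returns nil at end of file.
  while f.read(1024, buffer)
    chunk_sizes << buffer.bytesize
    hex = buffer.unpack('H*')[0]   # hex text for this chunk only
    hex_length += hex.length
  end
end

puts chunk_sizes.inspect   # [1024, 1024, 352]
puts hex_length            # 4800 (two hex digits per byte)
```

One caveat with this design: chunk boundaries are fixed at 1024 bytes and do not line up with block markers, so a marker can straddle two chunks and has to be handled by the processing code.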

(In reply to Tamara Temple <tamouse.lists@gmail.com>, Sat, Sep 14, 2013 at 10:20 AM.)

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

On Sat, Sep 14, 2013 at 6:44 PM, Sever Siller <lists@ruby-forum.com> wrote:

> Using what you suggested means eliminating the "while gets" loop. Since
> you say I need to process the buffer, what should replace this line
>
> line = $_.unpack('H*')[0]
>
> inside the loop "while f.read(1024,buffer)"?

Maybe you should first state what it is that you want to achieve.

> $/ is defined at the beginning of the script, as below:

Note that you do not need that: you can use #gets with an argument for
the delimiter.

Cheers

robert
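
A minimal sketch of passing the delimiter to #gets on an explicit file handle, leaving the global $/ alone. The temp file, the sample bytes, and the variable names are hypothetical:

```ruby
require 'tempfile'

# Hypothetical sample: three records, the first two ending in the marker 0xFF 0x45.
sample = Tempfile.new('records-demo')
sample.binmode
sample.write("\x01\x02\xff\x45\x03\xff\x45\x04".b)
sample.close

separator = "\xff\x45".b   # String#b returns a copy tagged as ASCII-8BIT
records   = []

File.open(sample.path, 'rb') do |f|
  # gets(separator) returns everything up to and including the separator,
  # or the remaining bytes at end of file, then nil.
  while (record = f.gets(separator))
    records << record
  end
end

puts records.length   # 3
puts $/.inspect       # "\n" (the global record separator is untouched)
```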

On Sat, Sep 14, 2013 at 8:23 PM, Sever Siller <lists@ruby-forum.com> wrote:

> I think it is fixed now adding
> BEGIN{ $/="\xff\x45".force_encoding("BINARY") }

Btw. there's no point in using a BEGIN block here.

And, still I would prefer to use the value as an argument to #gets and
not change the system wide setting. That is much more robust.

> http://yehudakatz.com/2010/05/05/ruby-1-9-encodings-a-primer-and-the-solution-for-rails/
>
> Many thanks for all the your help

You're welcome.

Cheers

robert

On Sat, Sep 14, 2013 at 9:35 PM, Sever Siller <lists@ruby-forum.com> wrote:

> Should be something like:
>
> while line = gets("\xff\x45".force_encoding("BINARY"))

Almost. I'd put the result of this expression in a local variable.
That is more efficient and you can give it a telling name.

Cheers

robert
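
Putting the suggestions together, the loop from the thread might be rewritten roughly like this. It is a sketch under assumptions: the temp file, the sample bytes, the names block_marker and wanted, and the pattern /\A0102/ (standing in for the thread's /Regexp/) are all made up for the example:

```ruby
require 'tempfile'

# Hypothetical sample: three blocks, each terminated by the marker 0xFF 0x45.
sample = Tempfile.new('blocks-demo')
sample.binmode
sample.write("\x01\x02\xaa\xff\x45\x09\x09\xff\x45\x01\x02\xbb\xff\x45".b)
sample.close

block_marker = "\xff\x45".b   # built once, with a telling name
wanted       = /\A0102/       # hypothetical filter on the hex text

matches = []

File.open(sample.path, 'rb') do |f|
  while (block = f.gets(block_marker))
    # Drop the trailing marker, then convert the block's bytes to hex text.
    hex = block.chomp(block_marker).unpack('H*')[0]
    next unless hex =~ wanted
    matches << hex
  end
end

puts matches.inspect   # ["0102aa", "0102bb"]
```

Reading through an explicit handle opened with 'rb' also avoids relying on the bare "gets", which goes through ARGF.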