Byte order reading on windows versus unix in ruby

I have written some code that reads bytes from a file in big-endian order (the file was written in big-endian order). It works fine on Unix, but on Windows it dies when it encounters 0x1A, also known as 26, also known as Ctrl-Z, also known as the EOF character in DOS. This causes the read to stop. Is this a known bug? Is something else going on here?

Thanks in advance,
Bob Evans

File.open("name", "rb")

s.

···

On Wed, 19 Oct 2005 09:38:43 +0200, Robert Evans <robert.evans@acm.org> wrote:

I have written some code that reads bytes from a file in bigendian order (the file was written in big-endian order). It works fine on unix, but on windows it dies when it encounters 1a, also known as 26 also known as ctrl-z also known as an eof character in dos. This causes the read to stop. Is this a known bug? Is something else going on here?

Thanks in advance,
Bob Evans

  Try opening the file with the "rb" mode, i.e. File.open("foobar", "rb").

- Ville

ASCII 26 is the EOF character for text files in DOS/Windows
systems. In order to treat 0x1a as "just another byte", you
need to open the file in binary mode. This is done by using the
'b' mode indicator. Binary mode on Windows/DOS systems also
turns off some line-ending handling used for text files.

Using binary mode on files on a Unix system has no effect, and
is safe to use when there is a chance the code may be run on
a Microsoft OS.

Tim Hammerquist
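
As an illustration of the advice above, here is a minimal sketch of reading big-endian integers in binary mode. The helper name `read_be_uint32s` and the sample data are hypothetical (they are not from the original poster's code), and the file is assumed to hold 32-bit unsigned values.

```ruby
require "tempfile"

# Hypothetical helper (not from the thread): read a whole file as
# big-endian unsigned 32-bit integers.
def read_be_uint32s(path)
  # "rb" = read-only, binary mode. On Windows this turns off the Ctrl-Z (0x1A)
  # end-of-file handling and the CRLF translation; on Unix the bytes read are
  # the same with or without the "b".
  data = File.open(path, "rb") { |io| io.read }
  data.unpack("N*") # "N" = 32-bit unsigned, big-endian (network byte order)
end

# Sample data containing a 0x1A byte, written in binary mode as well.
sample = Tempfile.new("binmode-demo")
sample.binmode
sample.write([0x1A, 0x01020304].pack("N2"))
sample.close

p read_be_uint32s(sample.path) # => [26, 16909060]
```

Without the "b", the same read on Windows would be expected to stop at the 0x1A byte, which matches the symptom described in the question.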

:) Just found it. Thanks!

···

On Oct 19, 2005, at 12:46 AM, Stefan Schmiedl wrote:

File.open("name", "rb")

s.

Thanks for the detailed description.

Bob

···

On Oct 19, 2005, at 1:01 AM, Tim Hammerquist wrote:

ASCII 26 is the EOF character for text files in DOS/Windows
systems. In order to treat 0x1a as "just another byte", you
need to open the file in binary mode. This is done by using the
'b' mode indicator. Binary mode on Windows/DOS systems also
turns off some line-ending handling used for text files.

Using binary mode on files on a Unix system has no effect, and
is safe to use when there is a chance the code may be run on
a Microsoft OS.

Tim Hammerquist

Isn't binmode supposed to be set automatically? Or is that coming in Ruby 1.8.3 or 2.0?

Zach

Hi,

···

In message "Re: Byte order reading on windows versus unix in ruby" on Wed, 19 Oct 2005 22:06:43 +0900, Zach Dennis <zdennis@mktec.com> writes:

Isn't binmode supposed to be getting set automatically? Or is this up
and coming for ruby 1.8.3 or 2.0 ?

Short answer: no.

Longer answer: if there could be a rule to set binmode automagically,
I'm glad to add it to CVS HEAD.

              matz.
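
Since binmode is not set automatically, a script that needs byte-exact I/O on an already-open stream can call `IO#binmode` itself. The following is only a sketch, with a temporary file standing in for real data:

```ruby
require "tempfile"

file = Tempfile.new("explicit-binmode")
file.binmode                    # switch the already-open stream to binary mode
file.write("\x00\x1A\r\n".b)    # includes 0x1A and a CRLF pair
file.rewind
bytes = file.read.bytes         # every byte comes back untouched
file.close!

p bytes # => [0, 26, 13, 10]
```

Calling `binmode` right after opening, before any reads or writes, keeps the behaviour the same on Windows and Unix.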

Yukihiro Matsumoto wrote:

Hi,

>Isn't binmode supposed to be getting set automatically? Or is this up
>and coming for ruby 1.8.3 or 2.0 ?

Short answer: no.

Longer answer: if there could be a rule to set binmode automagically,
I'm glad to add it to CVS HEAD.

Shouldn't binmode be the default? This way we'd have the same behaviour on Unix and Windows, which would increase the portability of Ruby scripts. We'd need a new method to leave binmode, but I haven't heard of anyone using non-binmode on Windows.

Regards,
Pit
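
For readers on later Ruby versions (1.9 and up; this postdates the thread), `IO.binread` is one way to get the same behaviour on both platforms without remembering the mode string. A small sketch, using a made-up five-byte sample:

```ruby
require "tempfile"

sample = Tempfile.new("binread-demo")
sample.binmode
sample.write("AB\x1ACD".b)   # 0x1A sits in the middle of the data
sample.close

# File.binread always reads in binary mode, so the 0x1A byte is not
# treated as end-of-file on Windows.
bytes = File.binread(sample.path)

p bytes.bytesize # => 5
```

The returned string is tagged as binary (ASCII-8BIT), so it is suited to `unpack` rather than to text processing.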

Possibly because it ~is~ the default. If you changed the default to
binmode, I reckon you'd be hearing a lot more about 'non-binmode'! ;)

Sean

···

On 10/25/05, Pit Capitain <pit@capitain.de> wrote:

I haven't heard of anyone using non-binmode on Windows.

Sean O'Halpin wrote:

Possibly because it ~is~ the default. If you changed the default to
binmode, I reckon you'd be hearing a lot more about 'non-binmode'! ;)

Yes, I thought of that, too. But I really doubt that there are many Ruby programs depending on non-binary mode. Maybe I should write an RCR so that others could vote on this.

Regards,
Pit

Hi Pit,

'Non-binary' mode (or 'text mode') comes from the underlying C
library. It's part of ANSI C and is what makes C and Ruby on Windows
work like C and Ruby on Unix for all text file processing out of the
box. It turns those horrible CRLFs (\r\n) into nice LFs (\n) so that
programmers see a uniform interface. YMMV but I do a heck of a lot
more text file processing than binary file processing in Ruby and my
(admittedly non-scientific) hunch is that most Ruby users on both Unix
and Windows do the same (e.g. Rails).

Would you want to have to set textmode on to make all your text
processing scripts portable? Yikes! But as I said, YMMV.

Regards,

Sean
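
To make the line-ending point concrete, here is a rough sketch of what a text-processing script would see with and without the translation. It works on an in-memory string, and the `gsub` only approximates what the C runtime's text mode does on Windows; it is not how Ruby implements it.

```ruby
# Bytes as they would be stored in a Windows-style text file.
raw = "first\r\nsecond\r\n"

# What a script sees when the file is read in binary mode:
binary_lines = raw.split("\n")

# Roughly what text mode hands to the script (CRLF collapsed to LF):
text_lines = raw.gsub("\r\n", "\n").split("\n")

p binary_lines # => ["first\r", "second\r"]
p text_lines   # => ["first", "second"]
```

With binary mode as the default, every text script would have to strip the stray "\r" itself, which is the portability cost described above.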

Thanks for your input, Sean. I answered the other thread.

Regards,
Pit