Encoding woes with command prompt output

I'm working on a script which is going to be printing some non-ASCII
characters (æøå), and for the life of me, I just can't seem to make it
print in the Windows Command Prompt terminal! I think I have done what
should be enough to make it work, defining the encoding in my Ruby file,
but it just isn't working. However, if I try to print the same
characters in irb, it works just fine. I can make it work by injecting
hex values in my strings, but I'd rather not have to do that, as that
code isn't very readable. I'm at a loss here, and would be grateful if
anyone can help me out with my encoding woes!

OS: Windows XP 32 bit SP3

C:\>ruby -v
ruby 1.9.2p180 (2011-02-18) [i386-mingw32]

Content in file test.rb:
# Encoding: CP850
puts "æøå"
puts "\x91\x9B\x86"
puts Encoding.default_external

C:\>ruby test.rb
├ª├©├Ñ
æøå
CP850

C:\>irb
irb(main):001:0> puts "æøå"
æøå
=> nil
irb(main):002:0> puts "\x91\x9B\x86"
æøå
=> nil
irb(main):003:0> puts Encoding.default_external
CP850

Regards,
Chris

--
Posted via http://www.ruby-forum.com/.

Hi,

It seems that the actual encoding of test.rb is not CP850 but UTF-8.

The string "æøå" is encoded as "\xC3\xA6\xC3\xB8\xC3\xA5" in UTF-8.

The string "├ª├©├Ñ" is "\xC3\xA6\xC3\xB8\xC3\xA5" in CP850.
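
The mismatch can be reproduced without a Windows console. The following is a minimal sketch, assuming the file is saved as UTF-8; `hex_escape` and `reinterpret_as_cp850` are hypothetical helper names introduced only for illustration.

```ruby
# encoding: UTF-8
# Sketch only: assumes this source file is saved as UTF-8.

# Hypothetical helper: show a string's raw bytes as \xNN escapes.
def hex_escape(str)
  str.bytes.map { |b| format("\\x%02X", b) }.join
end

# Hypothetical helper: read the string's bytes as if they were CP850
# (which is what the console does), then convert the result to UTF-8
# so it can be inspected.
def reinterpret_as_cp850(str)
  str.dup.force_encoding("CP850").encode("UTF-8")
end

text = "æøå"

puts hex_escape(text)                  # \xC3\xA6\xC3\xB8\xC3\xA5
puts reinterpret_as_cp850(text)        # ├ª├©├Ñ
puts hex_escape(text.encode("CP850"))  # \x91\x9B\x86
```

The last line shows that a real transcode to CP850 yields the same bytes as the hex-escaped string in test.rb, which is why that line prints correctly.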

Regards,
Park Heesob

2011/12/15 Chris Lervag <chris.lervag@gmail.com>

Heesob Park wrote in post #1036844:

It seems that the actual encoding of test.rb is not CP850 but UTF-8.

The string "æøå" is encoded as "\xC3\xA6\xC3\xB8\xC3\xA5" in UTF-8.

Well, I've tried marking the file with # Encoding: UTF-8 as well, but it
still doesn't help me get my "æøå" string printed properly to the
screen. So my problem remains: how do I need to configure this so that a
string like "æøå" in my Ruby file gets printed, without having to use
the 'ugly' hex injections?

Thanks,
Chris

Chris Lervag wrote in post #1036849:

Well, I've tried marking the file with # Encoding: UTF-8 as well, but it
still doesn't help me get my "æøå" string printed properly to the
screen. So my problem remains: how do I need to configure this so that a
string like "æøå" in my Ruby file gets printed, without having to use
the 'ugly' hex injections?

Did you check your console is actually using TrueType fonts?

Have you tried setting the codepage to Unicode? (chcp 65001)
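
If changing the console codepage is not an option, another approach that might work is to keep the file in UTF-8 and transcode each string just before printing. This is only a sketch, assuming the file is saved as UTF-8 and the console is on codepage 850; `to_console` is a hypothetical helper name, not part of Ruby.

```ruby
# encoding: UTF-8
# Assumptions: this file is saved as UTF-8, and the console uses
# codepage 850 (check with `chcp`).

# Hypothetical helper: transcode a string to the console codepage,
# substituting "?" for characters that codepage cannot represent.
def to_console(text, codepage = "CP850")
  text.encode(codepage, invalid: :replace, undef: :replace, replace: "?")
end

puts to_console("æøå")
```

The string literal stays readable in the source, and the bytes that reach the console are the CP850 ones ("\x91\x9B\x86") rather than the UTF-8 ones.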

--
Luis Lavena

Luis Lavena wrote in post #1036857:

Did you check your console is actually using TrueType fonts?

Have you tried setting the codepage to Unicode? (chcp 65001)

--
Luis Lavena

Thanks for the suggestion. I changed to Unicode as suggested (chcp
65001). I also changed my font in the Command Prompt to Lucida Console.

This seems to... almost work. However, my Ruby hangs! :(

Content in file test.rb:
# Encoding: CP65001
puts "æøå"

C:\>ruby test.rb
æøååå

And there it apparently hangs indefinitely! I have to press Ctrl+C to
abort it:
æøåååtest.rb:2:in `write': Interrupt
        from test.rb:2:in `puts'
        from test.rb:2:in `puts'
        from test.rb:2:in `<main>'

Curiously, I am also not able to access irb when using CP 65001:

C:\>irb

C:\>
C:\>chcp 850
Aktiv tegntabell: 850

C:\>irb
irb(main):001:0> puts "æøå"
æøå
=> nil
irb(main):002:0> exit

It's not that big a deal for me, I can still manage, but I can't help but
feel annoyed at not getting the encoding working right.

Regards,
Chris
